Assistant Commissioner for Patents
Washington, DC 20231 

REQUEST FOR FILING CONTINUATION/DIVISIONAL APPLICATION 

This is a request for filing a divisional application
under 37 CFR 1.60, of pending prior application Serial No.
08/623,679 filed on March 29, 1996 by Andrew W. Shyjan for
COMPOSITIONS AND METHODS FOR THE DIAGNOSIS, PREVENTION AND
TREATMENT OF TUMOR PROGRESSION, which is a continuation-in-part
of pending prior application Serial No. 08/412,431 filed on March
29, 1995 by Andrew W. Shyjan.

1. Enclosed is a true copy of the above identified
prior application, including the declaration as originally filed
and a copy of the signed declaration filed June 24, 1996 to
complete the prior application. The true copy of the prior
application is as follows: 161 pages of specification; five (5)
pages of claims; one (1) page of abstract; thirteen (13) sheets of
drawings; and four (2) pages of declaration and power of
attorney; Sequence Listing Statement under 37 CFR §1.821(f) and
29 pages of sequence listing.

"EXPRESS MAIL" Mailing Number EM518286979US

Date of Deposit May 23, 1997 



I hereby certify under 37 CFR 1.10 that this correspondence is being 
deposited with the United States Postal Service as "Express Mail 
Post Office To Addressee" with sufficient postage on the date 
indicated above and is addressed to the Assistant Commissioner for 
Patents, Washington, DC 20231.
Applicants also include a copy of the sequence listing in
paper and computer readable form (updated to include parent
application data) and a new statement under 37 CFR §1.821(f).
The new Sequence Listing contains no new matter.

2. Cancel in this application claims 1 through 5 and 7 
through 13 of the prior application before calculating the filing 
fee. At least one original independent claim has been retained 
for the purpose of filing this application. 

3. The filing fee, based on the original claims in the 
prior application less any claims cancelled herein, is calculated 
below: 

CLAIMS AS FILED

                             Number      Number
                             Filed       Extra       Rate       Fee

Basic Fee                                                       $770
Total Claims                 [14] - 20 = [0]      x  $11   =    [***]
Independent Claims           [4]  -  3 = [1]      x  $40   =    [40]
Multiple Dependent Claims                            $130  =    [***]

Total Filing Fee                                           =    $810.00
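
The arithmetic in the fee table can be reproduced with a short sketch. The function name `filing_fee` and its parameter names are hypothetical, chosen only for illustration; the rates and allowances are the ones printed in the table above, and the table lists no multiple dependent claim fee, so that term defaults to zero.

```python
def filing_fee(total_claims, independent_claims, basic_fee=770,
               total_rate=11, independent_rate=40,
               total_allowance=20, independent_allowance=3,
               multiple_dependent_fee=0):
    """Return (extra_total, extra_independent, fee) for a claims fee table.

    Only claims in excess of each allowance are charged at the per-claim rate.
    """
    extra_total = max(0, total_claims - total_allowance)
    extra_independent = max(0, independent_claims - independent_allowance)
    fee = (basic_fee
           + extra_total * total_rate
           + extra_independent * independent_rate
           + multiple_dependent_fee)
    return extra_total, extra_independent, fee


# Values taken from the table: 14 total claims, 4 independent claims.
extra_total, extra_independent, fee = filing_fee(total_claims=14,
                                                 independent_claims=4)
print(f"Extra total claims: {extra_total}")
print(f"Extra independent claims: {extra_independent}")
print(f"Total Filing Fee = ${fee:.2f}")
```

With these inputs there are no extra total claims and one extra independent claim, so the result is the $770 basic fee plus $40.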

4. Please amend the specification by replacing lines 5 
and 6, on page 1, with the sentence: --This is a divisional of 
copending application Serial No. 08/623,679, filed March 29,
1996, which is a continuation-in-part of U.S. Serial No. 
08/412,431, filed March 29, 1995.-- 

5. With respect to the prior application from which 
this application claims benefit under 35 U.S.C. 120, the inventor 
named in this application is the same as the one named in the 
prior application. The inventorship for the claims in this 
application is the same. 
6. The prior application is assigned of record to
Millennium Pharmaceuticals, Inc., a Delaware corporation, by 
virtue of an assignment submitted to the Patent and Trademark 
Office for recording on June 26, 1996 at Reel 8003, Frame 0981. 

7. Enclosed is a check for the filing fee of $810.00. 

8. The Commissioner is hereby authorized to charge any 
additional fees which may be required by this paper, or credit 
any overpayment, to Deposit Account No. 06-1050. A duplicate of 
this request is enclosed. 

9. The power of attorney in the prior application is 
to J. Peter Fasse, Reg. No. 32,983. The power appears in the 
true copy of the prior application. Address all future 
communications to J. Peter Fasse at Fish & Richardson P.C., 225 
Franklin Street, Boston, MA 02110-2804. 



Respectfully submitted, 



Date : 




Fish & Richardson P.C. 
225 Franklin Street 
Boston, MA 02110-2804 

Telephone : 617/542-5070 
Facsimile : 617/542-8906

241351. Bll 
I further declare that all statements made herein of my
own knowledge are true and that all statements made on
information and belief are believed to be true; and false
statements and the like so made are punishable by fine or
imprisonment, or both, under Section 1001 of Title 18 of The
United States Code, and that such willful false statements may
jeopardize the validity of the application or any patent issued
thereon.
SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Shyjan, Andrew W. 

(ii) TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE

DIAGNOSIS, PREVENTION AND TREATMENT OF TUMOR 
PROGRESSION 

(iii) NUMBER OF SEQUENCES: 9 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Richardson P.C. 

(B) STREET: 225 Franklin Street 

(C) CITY: Boston 

(D) STATE: MA 

(E) COUNTRY: USA 

(F) ZIP: 02110-2804 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 23-MAY-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/623,679 

(B) FILING DATE: 29-MAR-1996 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/412,431 

(B) FILING DATE: 29-MAR-1995 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Fasse, J. Peter 

(B) REGISTRATION NUMBER: 32,983 

(C) REFERENCE/DOCKET NUMBER: 07334/004002 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 617/542-5070 

(B) TELEFAX: 617/542-8906 

(C) TELEX: 200154 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 186 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGTGCTGGAG TACCTCATGG GCGGTGCCTA CCGCTGCAAC TACACTCGGA AAAGCTTCCG 60

GACTCTCTAC AACAACTTGT TTGGCCCTAA GAGGGTAGAG CTCAGCAGAC ACACAGTGTC 120

CTGTGCCTCC CAGAGTAACA TGTGGTTCCT TGATGTGCTT CCCCAAAAGC CCACCTGTGC 180

AGAATG 186
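
The LENGTH field of an entry can be cross-checked against the printed residues. The following is a minimal sketch, assuming the rows are pasted as plain text with their running position counters; the helper name `residues_from_listing` is hypothetical and not part of any sequence listing software.

```python
import re


def residues_from_listing(text):
    """Drop whitespace and running position counters, keeping only letters."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


# Rows of SEQ ID NO:1 as printed in the listing, counters included.
SEQ_ID_NO_1 = """
GGTGCTGGAG TACCTCATGG GCGGTGCCTA CCGCTGCAAC TACACTCGGA AAAGCTTCCG 60
GACTCTCTAC AACAACTTGT TTGGCCCTAA GAGGGTAGAG CTCAGCAGAC ACACAGTGTC 120
CTGTGCCTCC CAGAGTAACA TGTGGTTCCT TGATGTGCTT CCCCAAAAGC CCACCTGTGC 180
AGAATG 186
"""
DECLARED_LENGTH = 186  # from "(A) LENGTH: 186 base pairs"

sequence = residues_from_listing(SEQ_ID_NO_1)
unexpected = set(sequence) - set("ACGTN")  # N marks an undetermined base

print("residues counted:", len(sequence))
print("matches declared length:", len(sequence) == DECLARED_LENGTH)
print("unexpected symbols:", sorted(unexpected))
```

The same check applies to the longer entries, provided amino acid translation lines and residue numbers are removed before counting.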
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAGGAGGCTA GGCTGCACCC TTCCCGCTTG CTCAGCAGCT GAGGCAGGGT CAGAAAGCAT 60 

GGATAGAGAA GACATTTTGC AAAAGGGAAT GCATCTTTGT AATTCCCAGT ACAAAAGACC 120 

CTAACAGATG TTGCTGTGGT CAGCTCACTA ACCAGCACAT CCCCCCTTTG CCGAGTGGGG 180 

CTCCCAGCAC AACAGGAGAG GACACCAAGC AGGCAGACAC GCAGTCCGGG AAATGGTCTG 240 

TCAGCAAACA CACCCAGAGC TACCCAACAG ACTCCTATGG GATTCTTGAA TTCCAGGGTG 300 

GGGGTTACTC CAATAAAGCC ATG TAC ATC CGA GTC TCC TAC GAC ACC AAG 350 

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys
1 5 10

CCA GAT TCC CTG CTC CAC CTC ATG GTG AAG GAC TGG CAG CTG GAG CTC 398
Pro Asp Ser Leu Leu His Leu Met Val Lys Asp Trp Gln Leu Glu Leu
15 20 25

CCG AAG CTC TTG ATA TCT GTG CAC GGA GGC CTC CAA AGC TTC GAG ATG 446
Pro Lys Leu Leu Ile Ser Val His Gly Gly Leu Gln Ser Phe Glu Met
30 35 40

CAG TCC AAA CTG AAG CAG GTG TTT GGG AAA GGT CTG ATC AAG GCT GCC 494
Gln Ser Lys Leu Lys Gln Val Phe Gly Lys Gly Leu Ile Lys Ala Ala
45 50 55

ATG ACC ACG GGG GCG TGG ATC TTC ACC GGG GGT GTG AGC ACT GGT GTC 542
Met Thr Thr Gly Ala Trp Ile Phe Thr Gly Gly Val Ser Thr Gly Val
60 65 70 

GTC AGC CAT GTG GGG GAT GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA 590 
Val Ser His Val Gly Asp Ala Leu Lys Asp His Ser Ser Lys Ser Arg 
75 80 85 90 

GGC CGG CTC TGT GCT ATA GGA ATT GCT CCC TGG GGC ATG GTG GAG AAC 638 
Gly Arg Leu Cys Ala Ile Gly Ile Ala Pro Trp Gly Met Val Glu Asn
95 100 105

AAG GAA GAC CTG ATT GGA AAA GAT GTA ACA AGA GTC TAT CAG ACC ATG 686
Lys Glu Asp Leu Ile Gly Lys Asp Val Thr Arg Val Tyr Gln Thr Met
110 115 120 

TCC AAC CCT CTG AGC AAG CTC TCT GTG CTC AAC AAT TCC CAC ACT CAC 734 
Ser Asn Pro Leu Ser Lys Leu Ser Val Leu Asn Asn Ser His Thr His 
125 130 135 

TTC ATC TTG GCT GAC AAC GGC ACC CTG GGC AAG TAT GGT GCT GAG GTG 782 
Phe Ile Leu Ala Asp Asn Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val
140 145 150 



AAG CTT CGA AGA CAG CTG GAA AAA CAC ATC TCC CTG CAG AAG ATC AAC 830 
Lys Leu Arg Arg Gln Leu Glu Lys His Ile Ser Leu Gln Lys Ile Asn
155 160 165 170

ACA AGG CTG GGC CAG GGT GTA CCT GTC GTG GGC CTA GTG GTA GAA GGT 878
Thr Arg Leu Gly Gln Gly Val Pro Val Val Gly Leu Val Val Glu Gly
175 180 185

GGT CCT AAC GTG GTT TCT ATC GTC CTG GAG TAT CTC AAA GAA GAC CCT 926
Gly Pro Asn Val Val Ser Ile Val Leu Glu Tyr Leu Lys Glu Asp Pro
190 195 200

CCT GTC CCT GTG GTG GTT TGC GAT GGC AGT GGA CGT GCC TCT GAC ATT 974
Pro Val Pro Val Val Val Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile
205 210 215

TTG TCC TTC GCA CAC AAA TAC TGC GAC GAA GGA GGA GTC ATA AAC GAG 1022
Leu Ser Phe Ala His Lys Tyr Cys Asp Glu Gly Gly Val Ile Asn Glu
220 225 230

TCC CTG CGG GAC CAG CTT CTA GTT ACC ATT CAG AAA ACA TTT AAT TAC 1070
Ser Leu Arg Asp Gln Leu Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr
235 240 245 250

AGC AAG TCC CAG TCG TAT CAG CTG TTT GCA ATT ATC ATG GAG TGC ATG 1118
Ser Lys Ser Gln Ser Tyr Gln Leu Phe Ala Ile Ile Met Glu Cys Met
255 260 265

AAG AAG AAA GAA CTC GTC ACT GTG TTT CGG ATG GGT TCC GAG GGT CAG 1166
Lys Lys Lys Glu Leu Val Thr Val Phe Arg Met Gly Ser Glu Gly Gln
270 275 280

CAA GAT GTC GAG ATG GCA ATT TTA ACT GCC TTG CTC AAA GGA ACC AAC 1214
Gln Asp Val Glu Met Ala Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn
285 290 295

GCA TCA GCT CCA GAT CAG CTG AGC TTG GCC CTG GCT TGG AAC CGG GTC 1262
Ala Ser Ala Pro Asp Gln Leu Ser Leu Ala Leu Ala Trp Asn Arg Val
300 305 310

GAC ATA GCG CGA AGC CAG ATC TTC GTC TTT GGC CCA CAC TGG CCG CCA 1310
Asp Ile Ala Arg Ser Gln Ile Phe Val Phe Gly Pro His Trp Pro Pro
315 320 325 330 

CTG GGA AGC CTG GCC CCT CCT GTG GAC ACC AAA GCC GCA GAG AAG GAA 1358 
Leu Gly Ser Leu Ala Pro Pro Val Asp Thr Lys Ala Ala Glu Lys Glu 
335 340 345 

AAG AAG CCA CCC ACA GCC ACC ACC AAG GGG AGA GGA AAA GGA AAA GGC 1406 
Lys Lys Pro Pro Thr Ala Thr Thr Lys Gly Arg Gly Lys Gly Lys Gly 
350 355 360 

AAG AAG AAA GGC AAA GTG AAA GAG GAA GTG GAG GAA GAG ACG GAC CCC 1454 
Lys Lys Lys Gly Lys Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro 
365 370 375 

CGG AAG CTT GAG CTG CTC AAC TGG GTG AAT GCC CTG GAG CAA GCC ATG 1502 
Arg Lys Leu Glu Leu Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met
380 385 390

CTG GAT GCT CTT GTC CTA GAT CGG GTG GAC TTT GTA AAG CTC CTG ATT 1550
Leu Asp Ala Leu Val Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile
395 400 405 410 



GAA AAC GGA GTG AAC ATG CAG CAT TTC CTC ACC ATC CCG AGG CTG GAG 1598 
Glu Asn Gly Val Asn Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu
415 420 425 

GAG CTA TAC AAC ACC AGA CTG GGC CCA CCA AAC ACC CTT CAT CTG CTG 1646 
Glu Leu Tyr Asn Thr Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu 
430 435 440 

GTG CGG GAT GTA AAG AAG AGC AAC CTT CCA CCT GAT TAC CAC ATC AGC 1694 
Val Arg Asp Val Lys Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser
445 450 455

CTC ATT GAT ATA GGA CTG GTG CTG GAG TAC CTC ATG GGC GGT GCC TAC 1742
Leu Ile Asp Ile Gly Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr
460 465 470 

CGC TGC AAC TAC ACT CGG AAA AGC TTC CGG ACT CTC TAC AAC AAC TTG 1790 
Arg Cys Asn Tyr Thr Arg Lys Ser Phe Arg Thr Leu Tyr Asn Asn Leu 
475 480 485 490 

TTT GGC CCT AAG AGG GTA GAG CTC AGC AGA CAC ACA GTG TCC TGT GCC 1838 
Phe Gly Pro Lys Arg Val Glu Leu Ser Arg His Thr Val Ser Cys Ala 
495 500 505 

TCC CAG AGT AAC ATG TGG TTC CTT GAT GTG CTT CCC CAA AAG CCC ACC 1886 
Ser Gln Ser Asn Met Trp Phe Leu Asp Val Leu Pro Gln Lys Pro Thr
510 515 520

TGT GCA GAA TGC AAC TCT TCA CCT CAC CTG TCC CAA ACT GAC ATC ACC 1934
Cys Ala Glu Cys Asn Ser Ser Pro His Leu Ser Gln Thr Asp Ile Thr
525 530 535 

CCA CCT CTG CCC T GACACCCAGT GCAGGGCCTC CTAGCTTTCA CATGCAGCCA 1987 
Pro Pro Leu Pro 
540 



TTCACATCGC CTCTCAAGAC TGGGCCAGGC AGTGCAACCT GTCAAGCATG TCTGTCCTCC 2047

CCTCCTTCCT ACAATAGCCC CCCCTCTGGG CCCCATGCCT CTGCTCTCTC AGCCCGTTCT 2107

CCTCCCCACT GATCACTGGC GCTCCTGTTG TCTTCCAAGG CAAGGAACAA GGAAAAGCAT 2167

CTTTTTGCCC ACAAAAGTTT AGGGCTCCCC GCTGTTCAAC CATAGCCAAC CTCACTGTAC 2227

ATCGGAGTCA TCCAGGCCAG CTGCCACACA CAAGCCTTCC CCACCCTATC CCAATAGACC 2287

CTATTCCTCC ATCAAAATCA AAGCTAACTC CTGGCCTGCC ACATTGCTTC TTCTTGCTCC 2347

AGCCTGTTAA ACCTCCAATA AATGTCAGAT CTGTGGGAAG CCTTCCTCAC TCTCACTCCA 2407

CAGTTTGTAC AGAGAGCGAG AGCCTCGTTT GGTTCTACTT ACAAGGAAGG CTTTGTGTCT 2467

GTCTGTCCTT CCCAACTGAC TTCTGTTGAC AGAAGCAGTT TCCACATGAA AGCGTTGACT 2527

CACCTGGATG TTGTCATTAA TTAATAGTGA TACAAAATAT TGACACTTCT TTTCCTGCTT 2587

CTTTGTTATG CAGCCGAAAG CACTTAAGCT TCTGGGAATG GAAGTAAGTA GGACATGTTT 2647

GTGGCAGTTT ATTTACTATA TATACCTTTG TCATTCTGTG GAAGCAAAAA TTGCAATGTT 2707

TTCCATGAAT AAAGCTCGTG CC 2729
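
The amino acid line printed under each coding row follows from the standard genetic code, so a row can be spot-checked by translating its codons. This is a minimal sketch under that assumption; the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical, and the input is the coding portion of the first translated row of SEQ ID NO: 2 as printed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (last base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter codes as used in the listing; "*" is a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]]
            for i in range(0, usable, 3)]


first_row = "ATG TAC ATC CGA GTC TCC TAC GAC ACC AAG"
print(" ".join(translate(first_row)))
```

A codon containing a character outside A, C, G, T raises `KeyError`, which makes mis-scanned bases in a row easy to spot.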



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Ser Phe Glu Met Gln Ser Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Val Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Leu Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Met Val Glu Asn Lys Glu Asp Leu Ile Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Gln Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Val Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Lys Glu Asp Pro Pro Val Pro Val Val Val
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Asp Glu Gly Gly Val Ile Asn Glu Ser Leu Arg Asp Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Ser Lys Ser Gln Ser Tyr
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Val Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Ala Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Pro Pro Leu Gly Ser Leu Ala Pro
325 330 335

Pro Val Asp Thr Lys Ala Ala Glu Lys Glu Lys Lys Pro Pro Thr Ala
340 345 350

Thr Thr Lys Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val
355 360 365

Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Leu Glu Leu Leu
370 375 380

Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu
385 390 395 400

Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met
405 410 415

Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg
420 425 430

Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys
435 440 445

Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu
450 455 460

Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg
465 470 475 480

Lys Ser Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Val
485 490 495

Glu Leu Ser Arg His Thr Val Ser Cys Ala Ser Gln Ser Asn Met Trp
500 505 510

Phe Leu Asp Val Leu Pro Gln Lys Pro Thr Cys Ala Glu Cys Asn Ser
515 520 525

Ser Pro His Leu Ser Gln Thr Asp Ile Thr Pro Pro Leu Pro
530 535 540

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

GGGGAAGCAC ATCAAGGAAC 20

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs 



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GCAACTACTA CACTCGGAAA AGC 23 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4944 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 346..4837

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ACTCATTATA GGGNTCGAGC GGCCGCCCGG GCAGGTTTGA GCTGTGCCCT CTCCATTCCA 60 

CTGCTGTGGC AGGGTCAGAA ATCTTGGATA GAGAAAACCT TTTGCAAACG GGAATGTATC 120 

TTTGTAATTC CTAGCACGAA AGACTCTAAC AGGTGTTGCT GTGGCCAGTT CACCAACCAG 180 

CATATCCCCC CTCTGCCAAG TGCAACACCC AGCAAAAATG AAGAGGAAAG CAAACAGGTG 240 

GAGACTCAGC CTGAGAAATG GTCTGTTGCC AAGCACACCC AGAGCTACCC AACAGATTCC 300 

TATGGAGTTC TTGAATTCCA GGGTGGCGGA TATTCCAATA AAGCC ATG TAT ATC 354 

Met Tyr Ile
1 

CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG 402 
Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His Leu Met Val 
5 10 15 

AAA GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA 450 
Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser Val His Gly
20 25 30 35

GGC CTC CAG AAC TTT GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG 498
Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln Val Phe Gly
40 45 50

AAA GGC CTG ATC AAG GCT GCT ATG ACC ACC GGG GCC TGG ATC TTC ACC 546
Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp Ile Phe Thr
55 60 65

GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT GCC TTG AAA 594
Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp Ala Leu Lys
70 75 80

GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT 642
Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile Gly Ile Ala
85 90 95 



CCA TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA 690 
Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly Lys Asp Val
100 105 110 115

ACA AGA GTG TAC CAG ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG 738
Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys Leu Ser Val
120 125 130

CTC AAC AAC TCC CAC ACC CAC TTC ATC CTG GCT GAC AAT GGC ACC CTG 786
Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn Gly Thr Leu
135 140 145 

GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG GAA AAG CAC 834 
Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu Glu Lys His 
150 155 160 

ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC 882 
Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly Val Pro Leu
165 170 175

GTG GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG 930
Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser Ile Val Leu
180 185 190 195

GAA TAC CTG CAA GAA GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC 978
Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile Cys Asp Gly
200 205 210

AGC GGA CGT GCC TCG GAC ATC CTG TCC TTT GCG CAC AAG TAC TGT GAA 1026
Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys Tyr Cys Glu
215 220 225

GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT CTA GTT ACC 1074
Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu Leu Val Thr
230 235 240

ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT 1122
Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His Gln Leu Phe
245 250 255

GCA ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC 1170
Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val Thr Val Phe
260 265 270 275

AGA ATG GGT TCT GAG GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT 1218
Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala Ile Leu Thr
280 285 290

GCC CTG CTG AAA GGA ACA AAC GTA TCT GCT CCA GAT CAG CTG AGC TTG 1266
Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln Leu Ser Leu
295 300 305

GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG ATC TTT GTC 1314
Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Val
310 315 320 

TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC 1362 

Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro Pro Thr Asp 
325 330 335 

AGC AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG 1410 
Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala Thr Thr Lys 
340 345 350 355 



GGA GGA AGA GGA AAA GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG 1458
Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val Lys Glu 
360 365 370 

GAA GTG GAG GAA GAA ACT GAC CCC GGG AAG ATA GAG CTG CTG AAC TGG 1506 
Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu Leu Asn Trp
375 380 385

GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC TTA GAT CGT 1554
Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu Asp Arg
390 395 400

GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC 1602
Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met Gln His
405 410 415

TTT CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT 1650
Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg Leu Gly
420 425 430 435 

CCA CCA AAC ACA CTT CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC 1698 
Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys Ser Asn 
440 445 450 

CTT CCG CCT GAT TAC CAC ATC AGC CTC ATA GAC ATC GGG CTC GTG CTG 1746 
Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu Val Leu
455 460 465 

GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT CGG AAA AAC 1794 
Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg Lys Asn 
470 475 480 

TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT 1842 
Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Pro Lys Ala 
485 490 495 

CTT AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG 1890 
Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala Lys Gly Lys 
500 505 510 515 

AAA AAA AAA AAA AAG AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC 1938 
Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile Asp Val Asp
520 525 530

GAC CCT GCC GTG AGT CGG TTC CAG TAT CCC TTC CAC GAG CTG ATG GTG 1986
Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu Leu Met Val
535 540 545

TGG GCA GTG CTG ATG AAA CGC CAG AAA ATG GCA GTG TTC CTC TGG CAG 2034
Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe Leu Trp Gln
550 555 560 

CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC 2082 
Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys Lys Leu Tyr 
565 570 575 

AAG GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC 2130 
Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val Asp Asp Ile
580 585 590 595

TCC CAG GAC TTG GAT AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG 2178
Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln Leu Ala Leu
600 605 610 



GAG TTA TTA GAC CAG TCC TAT AAG CAT GAC GAG CAG ATC GCT ATG AAA 2226 
Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile Ala Met Lys
615 620 625

CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC TGC CTC AAA 2274
Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr Cys Leu Lys
630 635 640

CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC 2322
Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His Thr Cys Ser
645 650 655

CAG ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG 2370
Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg Met Arg Lys
660 665 670 675

AAC CCC GGC CTG AAG GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC 2418
Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro Pro Thr Ile
680 685 690

TTG TTT TTG GAA TTT CGC ACA TAT GAT GAT TTC TCG TAT CAA ACA TCC 2466
Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr Gln Thr Ser
695 700 705 

AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT ACG GAT GCA 2514 
Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn Thr Asp Ala 
710 715 720 

AAT GCA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA 2562 
Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn Glu His Lys 
725 730 735 

AAA CAG AGA ATT ATC CCC ATC GGA ACA AAA ATC TGT AAA TTC TAT AAC 2610 
Lys Gln Arg Ile Ile Pro Ile Gly Thr Lys Ile Cys Lys Phe Tyr Asn
740 745 750 755

GCG CCC ATT GTC AAG TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC 2658
Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr Leu Gly Tyr
760 765 770

CTG CTG CTG TTT AAC TAC GTC ATC CTG GTG CGG ATG GAT GGC TGG CCG 2706
Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp Gly Trp Pro
775 780 785

TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC CTG GCG TTA 2754
Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser Leu Ala Leu
790 795 800

GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG 2802
Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Ser Gln
805 810 815

AAA ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG 2850
Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr Asp Leu Val
820 825 830 835

GCC ATT TCC ACA TTC ATG ATT GGA GCA ATG GCC ACG AGA TCT GTG ATG 2898
Ala Ile Ser Thr Phe Met Ile Gly Ala Met Ala Thr Arg Ser Val Met
840 845 850

ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT GTG GTC ATC ATG 2946
Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe Val Val Ile Met
855 860 865 



CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA GCC ATT CTG CAT 2994
Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln Ala Ile Leu His
870 875 880

CCA GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC ATC TTC TAC ATG 3042
Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn Ile Phe Tyr Met
885 890 895

CCC TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC CAG ATA GAC CTC 3090
Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp Gln Ile Asp Leu
900 905 910 915

TAC GCC ATG GAA ATT AAT CCT CCT TGT GGT GAG AAC CTA TAT GAT GAG 3138
Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn Leu Tyr Asp Glu
920 925 930

GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC TGG CTC ACT CCA 3186
Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala Trp Leu Thr Pro
935 940 945

GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG CTG GTG AAC 3234
Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile Leu Leu Val Asn
950 955 960

CTG CTG ATT GCT GTG TTC AAC AAT ACC TTC TTT GAA GTA AAA TCA ATA 3282
Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu Val Lys Ser Ile
965 970 975

TCC AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG ATT ATG ACA TTT 3330
Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile Met Thr Phe
980 985 990 995

CAT GAC AGG CCA GTC CTG CCC CCA CCG ATG ATC ATT TTA AGC CAC ATC 3378
His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile Leu Ser His Ile
1000 1005 1010

TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG AAA AAG AGA GAA 3426
Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg Lys Lys Arg Glu
1015 1020 1025

GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC TTC CTT AGC GAC 3474
Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu Phe Leu Ser Asp
1030 1035 1040

GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG TGC GTG CAG GAG 3522
Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln Cys Val Gln Glu
1045 1050 1055

CAC TTC CGG GAG AAG GAG GAT GAG CAG CAG TCG TCC AGC GAC GAG CGC 3570
His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser Ser Asp Glu Arg
1060 1065 1070 1075

ATC CGG GTC ACT TCT GAA AGA GTT GAA AAT ATG TCA ATG AGG TTG GAA 3618
Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met Arg Leu Glu
1080 1085 1090

GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC CTG CAG ACT GTT 3666
Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser Leu Gln Thr Val
1095 1100 1105 



GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC AGA ATG GTG AAT 3714
Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn Arg Met Val Asn
1110 1115 1120



GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA 3762 
Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp Leu Ile Gln Ala
1125 1130 1135

CGG TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT CTT CTC CGG CAA 3810
Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr Leu Leu Arg Gln
1140 1145 1150 1155

AGC AGC ATC AAT AGC GCT GAT GGC TAC AGC TTG TAT CGA TAT CAT TTT 3858
Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr Arg Tyr His Phe
1160 1165 1170 

AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC TCC ACG TCA CCA 3906 
Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu Ser Thr Ser Pro 
1175 1180 1185 

GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA AAG GAA GAG 3954 
Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg Ile Lys Glu Glu
1190 1195 1200

AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC AGT CTT CAC 4002
Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln Asn Ser Leu His
1205 1210 1215 

CTT TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT 4050 
Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp Gly Ser His Leu 
1220 1225 1230 1235 

GCA GTA GAT GAC TTA AAG AAC GCT GAA GAG TCA AAA TTA GGT CCA GAT 4098 
Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys Leu Gly Pro Asp 
1240 1245 1250 

ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA GAC TCT AAA AAA 4146 
Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr Asp Ser Lys Lys
1255 1260 1265

GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT GTG ATA CAT GGA 4194
Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp Val Ile His Gly
1270 1275 1280

CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA 4242
Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr Val Glu Thr Thr
1285 1290 1295

AAT ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA 4290
Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu Thr Lys Ile Thr
1300 1305 1310 1315

CGC TAT TTC CCC GAT GAA ACG ATC AAT GCT TGT AAA ACA ATG AAG TCC 4338
Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys Thr Met Lys Ser
1320 1325 1330 

AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC GGT GGG GTT AAC 4386 
Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val Gly Gly Val Asn 
1335 1340 1345 

CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG ACG ACG GAA 4434 
Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln Leu Thr Thr Glu
1350 1355 1360

TGG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT 4482
Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His Ser Thr Asp Ile
1365 1370 1375 



CCT TAC ATT GTG TCG GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG CAG 4530
Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu Gln Lys Glu Gln
1380 1385 1390 1395

TTT GCA GAT ATG CAA GAT GAA CAC CAT GTC GCT GAA GCA ATT CCT CGA 4578
Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu Ala Ile Pro Arg
1400 1405 1410

ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT GGG ATG GAA AAC 4626
Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn Gly Met Glu Asn
1415 1420 1425

TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA TCT CTC AGG 4674
Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe Pro Ser Leu Arg
1430 1435 1440

TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA TCC ATT CAG GGA 4722
Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys Ser Ile Gln Gly
1445 1450 1455

AAG TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC AGC TTA GTA ATT 4770
Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser Ser Leu Val Ile
1460 1465 1470 1475

GTG TCT GGA ATG ACA GCA GAA GAA AAA AAG GTT AAG AAA GAG AAA GCT 4818
Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys Lys Glu Lys Ala
1480 1485 1490



TCC ACA GAA ACT GAA TGC T AGTCTGTTTT GTTTCTTTAA TTTTTTTTTT 4867 
Ser Thr Glu Thr Glu Cys 
1495 



TAACAGTCAG AACCACTAAT GGGTGTCATC TTGGCCATCC TAAACATCCA TCCAATTTCC 4927 
TAAAAACATT TTCCCTT 4944 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1497 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile
85 90 95



Gly Ile Ala Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Leu Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Glu Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro
325 330 335

Pro Thr Asp Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala
340 345 350

Thr Thr Lys Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys
355 360 365

Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu
370 375 380

Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val
385 390 395 400

Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn
405 410 415

Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr
420 425 430



Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys 
435 440 445 

Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly
450 455 460

Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr
465 470 475 480

Arg Lys Asn Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg
485 490 495

Pro Lys Ala Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala
500 505 510

Lys Gly Lys Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile
515 520 525

Asp Val Asp Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu
530 535 540

Leu Met Val Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe
545 550 555 560

Leu Trp Gln Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys
565 570 575

Lys Leu Tyr Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val
580 585 590

Asp Asp Ile Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln
595 600 605

Leu Ala Leu Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile
610 615 620

Ala Met Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr
625 630 635 640

Cys Leu Lys Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His
645 650 655

Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg
660 665 670

Met Arg Lys Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro
675 680 685

Pro Thr Ile Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr
690 695 700

Gln Thr Ser Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn
705 710 715 720

Thr Asp Ala Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn
725 730 735

Glu His Lys Lys Gln Arg Ile Ile Pro Ile Gly Thr Lys Ile Cys Lys
740 745 750

Phe Tyr Asn Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr
755 760 765



Leu Gly Tyr Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp
770 775 780

Gly Trp Pro Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser
785 790 795 800

Leu Ala Leu Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys
805 810 815

Leu Ser Gln Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr
820 825 830

Asp Leu Val Ala Ile Ser Thr Phe Met Ile Gly Ala Met Ala Thr Arg
835 840 845

Ser Val Met Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe Val
850 855 860

Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln Ala
865 870 875 880

Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn Ile
885 890 895

Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp Gln
900 905 910

Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn Leu
915 920 925

Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala Trp
930 935 940

Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile Leu
945 950 955 960

Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu Val
965 970 975

Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile
980 985 990

Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile Leu
995 1000 1005

Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg Lys
1010 1015 1020

Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu Phe
1025 1030 1035 1040

Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln Cys
1045 1050 1055

Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser Ser
1060 1065 1070

Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met
1075 1080 1085

Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser Leu
1090 1095 1100



Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn Arg
1105 1110 1115 1120

Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp Leu
1125 1130 1135

Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr Leu
1140 1145 1150

Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr Arg
1155 1160 1165

Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu Ser
1170 1175 1180

Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg Ile
1185 1190 1195 1200

Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln Asn
1205 1210 1215

Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp Gly
1220 1225 1230

Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys Leu
1235 1240 1245

Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr Asp
1250 1255 1260

Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp Val
1265 1270 1275 1280

Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr Val
1285 1290 1295

Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu Thr
1300 1305 1310

Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys Thr
1315 1320 1325

Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val Gly
1330 1335 1340

Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln Leu
1345 1350 1355 1360

Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His Ser
1365 1370 1375

Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu Gln
1380 1385 1390

Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu Ala
1395 1400 1405

Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn Gly
1410 1415 1420

Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe Pro
1425 1430 1435 1440



Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys Ser 
1445 1450 1455 

Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser Ser
1460 1465 1470

Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys Lys
1475 1480 1485

Glu Lys Ala Ser Thr Glu Thr Glu Cys 
1490 1495 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5055 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 346..4945

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ACTCATTATA GGGNTCGAGC GGCCGCCCGG GCAGGTTTGA GCTGTGCCCT CTCCATTCCA 60 

CTGCTGTGGC AGGGTCAGAA ATCTTGGATA GAGAAAACCT TTTGCAAACG GGAATGTATC 120 

TTTGTAATTC CTAGCACGAA AGACTCTAAC AGGTGTTGCT GTGGCCAGTT CACCAACCAG 180 

CATATCCCCC CTCTGCCAAG TGCAACACCC AGCAAAAATG AAGAGGAAAG CAAACAGGTG 240 

GAGACTCAGC CTGAGAAATG GTCTGTTGCC AAGCACACCC AGAGCTACCC AACAGATTCC 300 

TATGGAGTTC TTGAATTCCA GGGTGGCGGA TATTCCAATA AAGCC ATG TAT ATC 354 

Met Tyr Ile
1 

CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG 402 
Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His Leu Met Val 
5 10 15 

AAA GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA 450 
Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser Val His Gly
20 25 30 35 

GGC CTC CAG AAC TTT GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG 498 
Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln Val Phe Gly
40 45 50 

AAA GGC CTG ATC AAG GCT GCT ATG ACC ACC GGG GCC TGG ATC TTC ACC 546 
Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp Ile Phe Thr
55 60 65 

GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT GCC TTG AAA 594 
Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp Ala Leu Lys
70 75 80 



GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT 642 
Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile Gly Ile Ala
85 90 95 

CCA TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA 690 
Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly Lys Asp Val
100 105 110 115 

ACA AGA GTG TAC CAG ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG 738 
Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys Leu Ser Val
120 125 130 

CTC AAC AAC TCC CAC ACC CAC TTC ATC CTG GCT GAC AAT GGC ACC CTG 786 
Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn Gly Thr Leu
135 140 145 

GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG GAA AAG CAC 834 
Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu Glu Lys His 
150 155 160 

ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC 882 
Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly Val Pro Leu
165 170 175 

GTG GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG 930 
Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser Ile Val Leu
180 185 190 195 

GAA TAC CTG CAA GAA GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC 978 
Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile Cys Asp Gly
200 205 210 

AGC GGA CGT GCC TCG GAC ATC CTG TCC TTT GCG CAC AAG TAC TGT GAA 1026 
Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys Tyr Cys Glu
215 220 225 

GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT CTA GTT ACC 1074 
Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu Leu Val Thr
230 235 240 

ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT 1122 
Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His Gln Leu Phe
245 250 255 

GCA ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC 1170 
Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val Thr Val Phe
260 265 270 275 

AGA ATG GGT TCT GAG GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT 1218 
Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala Ile Leu Thr
280 285 290 

GCC CTG CTG AAA GGA ACA AAC GTA TCT GCT CCA GAT CAG CTG AGC TTG 1266 
Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln Leu Ser Leu
295 300 305 

GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG ATC TTT GTC 1314 
Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Val
310 315 320 

TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC 1362 
Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro Pro Thr Asp 
325 330 335 



AGC AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG 1410 
Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala Thr Thr Lys 
340 345 350 355 

GGA GGA AGA GGA AAA GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG 1458 
Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val Lys Glu 
360 365 370 

GAA GTG GAG GAA GAA ACT GAC CCC CGG AAG ATA GAG CTG CTG AAC TGG 1506 
Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu Leu Asn Trp
375 380 385 

GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC TTA GAT CGT 1554 
Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu Asp Arg
390 395 400 

GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC 1602 
Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met Gln His
405 410 415 

TTT CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT 1650 
Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg Leu Gly
420 425 430 435 

CCA CCA AAC ACA CTT CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC 1698 
Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys Ser Asn 
440 445 450 

CTT CCG CCT GAT TAC CAC ATC AGC CTC ATA GAC ATC GGG CTC GTG CTG 1746 
Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu Val Leu
455 460 465 

GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT CGG AAA AAC 1794 
Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg Lys Asn 
470 475 480 

TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT 1842 
Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Pro Lys Ala 
485 490 495 

CTT AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG 1890 
Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala Lys Gly Lys 
500 505 510 515 

AAA AAA AAA AAA AAG AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC 1938 
Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile Asp Val Asp
520 525 530 

GAC CCT GCC GTG AGT CGG TTC CAG TAT CCC TTC CAC GAG CTG ATG GTG 1986 
Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu Leu Met Val
535 540 545 

TGG GCA GTG CTG ATG AAA CGC CAG AAA ATG GCA GTG TTC CTC TGG CAG 2034 
Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe Leu Trp Gln
550 555 560 

CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC 2082 
Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys Lys Leu Tyr 
565 570 575 

AAG GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC 2130 
Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val Asp Asp Ile
580 585 590 595 



TCC CAG GAC TTG GAT AAC AAT TCC AAA GAG TTC GGC CAG CTT GCT TTG 2178 
Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln Leu Ala Leu
600 605 610 

GAG TTA TTA GAC CAG TCC TAT AAG CAT GAC GAG CAG ATC GCT ATG AAA 2226 
Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile Ala Met Lys
615 620 625 

CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC TGC CTC AAA 2274 
Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr Cys Leu Lys 
630 635 640 

CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC 2322 
Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His Thr Cys Ser
645 650 655 

CAG ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG 2370 
Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg Met Arg Lys
660 665 670 675 

AAC CCC GGC CTG AAG GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC 2418 
Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro Pro Thr Ile
680 685 690 

TTG TTT TTG GAA TTT CGC ACA TAT GAT GAT TTC TCG TAT CAA ACA TCC 2466 
Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr Gln Thr Ser
695 700 705 

AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT ACG GAT GCA 2514 
Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn Thr Asp Ala 
710 715 720 

AAT GCA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA 2562 
Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn Glu His Lys 
725 730 735 

AAA CAG AGA AGT ATT CCC ATC GGA ACA AAG ATC TGT GAA TTC TAT AAC 2610 
Lys Gln Arg Ser Ile Pro Ile Gly Thr Lys Ile Cys Glu Phe Tyr Asn
740 745 750 755 

GCG CCC ATT GTC AAG TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC 2658 
Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr Leu Gly Tyr
760 765 770 

CTG CTG CTG TTT AAC TAC GTC ATC CTG GTG CGG ATG GAT GGC TGG CCG 2706 
Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp Gly Trp Pro
775 780 785 

TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC CTG GCG TTA 2754 
Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser Leu Ala Leu
790 795 800 

GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG 2802 
Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Ser Gln
805 810 815 

AAA ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG 2850 
Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr Asp Leu Val
820 825 830 835 

GCC ATT TCC ACA TTC ATG ATT GGA GCA ATT CTT CGC CTA CAG AAC CAG 2898 
Ala Ile Ser Thr Phe Met Ile Gly Ala Ile Leu Arg Leu Gln Asn Gln
840 845 850 



CCC TAC ATG GGC TAT GGC CGG GTG ATC TAC TGT GTG GAT ATC ATC TTC 2946
Pro Tyr Met Gly Tyr Gly Arg Val Ile Tyr Cys Val Asp Ile Ile Phe
855 860 865

TGG TAC ATC CGT GTC CTG GAC ATC TTT GGT GTC AAC AAG TAT CTG GGG 2994
Trp Tyr Ile Arg Val Leu Asp Ile Phe Gly Val Asn Lys Tyr Leu Gly
870 875 880

CCA TAC GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT 3042
Pro Tyr Val Met Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe
885 890 895

GTG GTC ATC ATG CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA 3090
Val Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln
900 905 910 915

GCC ATT CTG CAT CCA GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC 3138
Ala Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn
920 925 930

ATC TTC TAC ATG CCC TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC 3186
Ile Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp
935 940 945

CAG ATA GAC CTC TAC GCC ATG GAA ATT AAT CCT CCT TGT GGT GAG AAC 3234
Gln Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn
950 955 960

CTA TAT GAT GAG GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC 3282
Leu Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala
965 970 975

TGG CTC ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC 3330
Trp Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile
980 985 990 995

CTG CTG GTG AAC CTG CTG ATT GCT GTG TTC AAC AAT ACT TTC TTT GAA 3378
Leu Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu
1000 1005 1010

GTA AAA TCA ATA TCC AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG 3426
Val Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu
1015 1020 1025

ATT ATG ACA TTT CAT GAC AGG CCA GTC CTG CCC CCA CCG ATG ATC ATT 3474
Ile Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile
1030 1035 1040

TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG 3522
Leu Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg
1045 1050 1055

AAA AAG AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC 3570
Lys Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu
1060 1065 1070 1075

TTC CTT AGC GAC GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG 3618
Phe Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln
1080 1085 1090

TGC GTG CAG GAG CAC TTC CGG GAG AAG GAG GAT GAG CAG CAG TCG TCC 3666
Cys Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser
1095 1100 1105



AGC GAC GAG CGC ATC CGG GTC ACT TCT GAA AGA GTT GAA AAT ATG TCA 3714 
Ser Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser
1110 1115 1120 

ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC 3762 
Met Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser
1125 1130 1135 

CTG CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC 3810 
Leu Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn
1140 1145 1150 1155

AGA ATG GTG AAT GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC 3858 
Arg Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp
1160 1165 1170 

CTG ATC CAG GCA CGG TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT 3906 
Leu Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr
1175 1180 1185 

CTT CTC CGG CAA AGC AGC ATC AAT AGC GCT GAT GGC TAC AGC TTG TAT 3954 
Leu Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr
1190 1195 1200 

CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC 4002 
Arg Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu 
1205 1210 1215 

TCC ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT 4050 
Ser Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg 
1220 1225 1230 1235 

ATA AAG GAA GAG AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG 4098 
Ile Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln
1240 1245 1250 

AAC AGT CTT CAC CTT TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT 4146 
Asn Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp 
1255 1260 1265 

GGC AGT CAC CTT GCA GTA GAT GAC TTA AAG AAC GCT GAA GAG TCA AAA 4194 
Gly Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys 
1270 1275 1280 

TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA 4242 
Leu Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr
1285 1290 1295 

GAC TCT AAA AAA GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT 4290 
Asp Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp
1300 1305 1310 1315 

GTG ATA CAT GGA CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA 4338 
Val Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr
1320 1325 1330 

GTG GAA ACG ACA AAT ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA 4386 
Val Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu
1335 1340 1345 

ACC AAA ATT ACA CGC TAT TTC CCC GAT GAA ACG ATC AAT GCT TGT AAA 4434 
Thr Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys
1350 1355 1360 



ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC 4482 
Thr Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val 
1365 1370 1375 

GGT GGG GTT AAC CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA 4530 
Gly Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln
1380 1385 1390 1395 

TTG ACG ACG GAA TGG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT 4578 
Leu Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His
1400 1405 1410 

AGC ACA GAT ATT CCT TAC ATT GTG TCG GAA GCT GCA GTG CAA GCT GAG 4626 
Ser Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu
1415 1420 1425 

CAA AAA GAG CAG TTT GCA GAT ATG CAA GAT GAA CAC CAT GTC GCT GAA 4674 
Gln Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu
1430 1435 1440 

GCA ATT CCT CGA ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT 4722 
Ala Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn
1445 1450 1455 

GGG ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC 4770 
Gly Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe
1460 1465 1470 1475 

CCA TCT CTC AGG TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA 4818 
Pro Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys 
1480 1485 1490 

TCC ATT CAG GGA AAG TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC 4866 
Ser Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser
1495 1500 1505 

AGC TTA GTA ATT GTG TCT GGA ATG ACA GCA GAA GAA AAA AAG GTT AAG 4914 
Ser Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys
1510 1515 1520 

AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC T AGTCTGTTTT GTTTCTTTAA 4965 
Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys 
1525 1530 

TTTTTTTTTT TAACAGTCAG AAACCCACTA ATGGGTGTCA TCTTGGCCCA TCCTAAACAC 5025 
ATMTCCAATT TCCTAAAAAC ATTTTCCCTT 5055 
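
The pairing of codons and residues in the listing above can be spot-checked mechanically. The following is a minimal Python sketch, offered only as an illustration and assuming the standard genetic code; the names `translate`, `CODON_TABLE` and `THREE_LETTER` are hypothetical and do not come from the application. It translates an in-frame DNA string into three-letter residue codes, reports `Xaa` for any codon containing an ambiguous base, and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: no organism-specific code variations apply).
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}

# One-letter to three-letter residue codes as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate in-frame DNA into a list of three-letter residue codes.

    Whitespace is ignored, translation stops at the first stop codon,
    and codons with non-ACGT symbols are reported as "Xaa".
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3])
        if aa is None:
            residues.append("Xaa")  # ambiguous or unreadable codon
        elif aa == "*":
            break  # stop codon ends the coding region
        else:
            residues.append(THREE_LETTER[aa])
    return residues


if __name__ == "__main__":
    # First three codons of the coding region beginning at position 346.
    print(" ".join(translate("ATG TAT ATC")))
```

Applied to the first three codons of the coding region (ATG TAT ATC), the sketch gives Met Tyr Ile, which agrees with the residues shown beneath those codons in the listing.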



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1533 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Leu Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Glu Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro
325 330 335

Pro Thr Asp Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala 
340 345 350 

Thr Thr Lys Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys 
355 360 365 



Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu
370 375 380

Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val
385 390 395 400

Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn
405 410 415

Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr
420 425 430

Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys
435 440 445

Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly
450 455 460

Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr
465 470 475 480

Arg Lys Asn Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg
485 490 495

Pro Lys Ala Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala
500 505 510

Lys Gly Lys Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile
515 520 525

Asp Val Asp Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu
530 535 540

Leu Met Val Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe
545 550 555 560

Leu Trp Gln Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys
565 570 575

Lys Leu Tyr Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val
580 585 590

Asp Asp Ile Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln
595 600 605

Leu Ala Leu Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile
610 615 620

Ala Met Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr
625 630 635 640

Cys Leu Lys Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His
645 650 655

Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg
660 665 670

Met Arg Lys Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro
675 680 685

Pro Thr Ile Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr
690 695 700



Gln Thr Ser Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn
705 710 715 720

Thr Asp Ala Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn
725 730 735

Glu His Lys Lys Gln Arg Ser Ile Pro Ile Gly Thr Lys Ile Cys Glu
740 745 750

Phe Tyr Asn Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr
755 760 765

Leu Gly Tyr Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp
770 775 780

Gly Trp Pro Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser
785 790 795 800

Leu Ala Leu Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys
805 810 815

Leu Ser Gln Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr
820 825 830

Asp Leu Val Ala Ile Ser Thr Phe Met Ile Gly Ala Ile Leu Arg Leu
835 840 845

Gln Asn Gln Pro Tyr Met Gly Tyr Gly Arg Val Ile Tyr Cys Val Asp
850 855 860

Ile Ile Phe Trp Tyr Ile Arg Val Leu Asp Ile Phe Gly Val Asn Lys
865 870 875 880

Tyr Leu Gly Pro Tyr Val Met Met Ile Gly Lys Met Met Ile Asp Met
885 890 895

Leu Tyr Phe Val Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val
900 905 910

Ala Arg Gln Ala Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu
915 920 925

Ala Arg Asn Ile Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val
930 935 940

Phe Ala Asp Gln Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys
945 950 955 960

Gly Glu Asn Leu Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile
965 970 975

Pro Gly Ala Trp Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val
980 985 990

Ala Asn Ile Leu Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr
995 1000 1005

Phe Phe Glu Val Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg
1010 1015 1020

Tyr Gln Leu Ile Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro
1025 1030 1035 1040



Met Ile Ile Leu Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly
1045 1050 1055

Arg Cys Arg Lys Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly
1060 1065 1070

Leu Lys Leu Phe Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe
1075 1080 1085

Glu Glu Gln Cys Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln
1090 1095 1100

Gln Ser Ser Ser Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu
1105 1110 1115 1120

Asn Met Ser Met Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met
1125 1130 1135

Lys Thr Ser Leu Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu
1140 1145 1150

Leu Ser Asn Arg Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp
1155 1160 1165

Arg Ser Asp Leu Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu
1170 1175 1180

Ala Thr Tyr Leu Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr
1185 1190 1195 1200

Ser Leu Tyr Arg Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp
1205 1210 1215

Thr Ser Leu Ser Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys
1220 1225 1230

Ser Phe Arg Ile Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro
1235 1240 1245

Glu Cys Gln Asn Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala
1250 1255 1260

Thr Pro Asp Gly Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu
1265 1270 1275 1280

Glu Ser Lys Leu Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu
1285 1290 1295

Arg Gln Thr Asp Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn
1300 1305 1310

Lys Thr Asp Val Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr
1315 1320 1325

Gln Leu Thr Val Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro
1330 1335 1340

Leu Glu Glu Thr Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn
1345 1350 1355 1360

Ala Cys Lys Thr Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg 
1365 1370 1375 



Lys Leu Val Gly Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr
1380 1385 1390

Asp Gln Gln Leu Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr
1395 1400 1405

Arg Ser His Ser Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val
1410 1415 1420

Gln Ala Glu Gln Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His
1425 1430 1435 1440

Val Ala Glu Ala Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr
1445 1450 1455

Asp Arg Asn Gly Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr
1460 1465 1470

Leu Gly Phe Pro Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg
1475 1480 1485

Asn Val Lys Ser Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser
1490 1495 1500

Ser Val Ser Ser Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys
1505 1510 1515 1520

Lys Val Lys Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys 
1525 1530 
PATENT

ATTORNEY DOCKET NO: 07334/004001 

COMPOSITIONS AND METHODS FOR THE DIAGNOSIS, 
PREVENTION AND TREATMENT OF TUMOR PROGRESSION 

This is a continuation-in-part of U.S. Serial No. 08/412,431.

1. INTRODUCTION 
The present invention relates to methods and
compositions for the diagnosis, prevention and treatment of
tumor progression in mammals, for example, humans. The
different types of tumors may include, but are not limited
to, human melanomas, breast, gastrointestinal tumors such as
esophageal, stomach, duodenal, colon, colorectal and rectal
cancers, prostate, bladder, testicular, ovarian, uterine,
cervical, brain, lung, bronchial, larynx, pharynx, liver,
pancreatic, thyroid, bone, various types of skin cancers and
neoplastic conditions such as leukemias and lymphomas.
Specifically, genes which are differentially expressed in
tumor cells relative to normal cells and/or relative to
tumor cells at a different stage of tumor progression are
identified. For example, genes are identified which are
differentially expressed in benign (e.g., non-malignant)
tumor cells relative to malignant tumor cells exhibiting a
high metastatic potential. Genes are also identified via
the ability of their gene products to interact with gene
products involved in the progression to and/or
aggressiveness of neoplastic tumor disease states. The
genes identified can be used diagnostically or as targets
for therapeutic intervention. In this regard, the present
invention provides methods for the identification of
compounds useful in the diagnosis, prevention and
therapeutic treatment of tumor progression, including, for
example, metastatic neoplastic disorders. The present
invention also provides methods for the identification of
compounds useful in the diagnosis, prevention and
therapeutic treatment of tumor progression, including, for
example, pre-neoplastic and/or benign states. Additionally,
methods are provided for the diagnostic evaluation and
prognosis of conditions involving tumor progression, for the
identification of subjects exhibiting a predisposition to
such conditions, for monitoring patients undergoing clinical
evaluation for the prevention and treatment of tumor
progression disorders, and for monitoring the efficacy of
compounds used in clinical trials.

2. BACKGROUND OF THE INVENTION

Cancer is the second leading cause of death in the
United States, after heart disease (Boring, C.C. et al.,
1993, CA Cancer J. Clin. 41:7); it develops in one in three
Americans, and one of every four Americans dies of cancer.
Cancer is characterized primarily by an increase in the
number of abnormal, or neoplastic, cells derived from a
given normal tissue which proliferate to form a tumor mass,
the invasion of adjacent tissues by these neoplastic tumor
cells, and the generation of malignant cells which spread
via the blood or lymphatic system to regional lymph nodes
and to distant sites. The latter progression to malignancy
is referred to as metastasis.

Cancer can be viewed as a breakdown in the
communication between tumor cells and their environment,
including their normal neighboring cells. Signals, both
growth-stimulatory and growth-inhibitory, are routinely
exchanged between cells within a tissue. Normally, cells do
not divide in the absence of stimulatory signals, and,
likewise, will cease dividing in the presence of inhibitory
signals. In a cancerous, or neoplastic, state, a cell
acquires the ability to "override" these signals and to
proliferate under conditions in which normal cells would not
grow.



Tumor cells must acquire a number of distinct 
aberrant traits to proliferate. Reflecting this requirement 
is the fact that the genomes of certain well-studied tumors 
carry several different independently altered genes, 
including activated oncogenes and inactivated tumor 
suppressor genes. Each of these genetic changes appears to 
be responsible for imparting some of the traits that, in 
aggregate, represent the full neoplastic phenotype (Land, H. 
et al., 1983, Science 222:771; Ruley, H.E., 1983, Nature 
304:602; Hunter, T., 1991, Cell 64:249).

In addition to unhindered cell proliferation, cells 
must acquire several traits for tumor progression to occur. 
For example, early on in tumor progression, cells must evade 
the host immune system. Further, as tumor mass increases, 
the tumor must acquire vasculature to supply nourishment and 
remove metabolic waste. Additionally, cells must acquire an 
ability to invade adjacent tissue, and, ultimately, cells 
often acquire the capacity to metastasize to distant sites. 

The biochemical basis for immune recognition of
tumor cells is unclear. It is possible that the
tumorigenicity of cells can increase when the cells' display
of Class I histocompatibility antigens is reduced (Schrier,
P.I. et al., 1983, Nature 305:771), in that these antigens,
in conjunction with tumor-specific antigens, are required for
the tumor cells to be recognized by cytotoxic T lymphocytes
(CTLs). Tumor cells which have lost one or more genes
encoding tumor-specific antigens seem to escape recognition
by the corresponding reactive CTLs (Van der Bruggen, P. et
al., 1991, Science 254:1643).

Once a tumor reaches more than about 1 mm in
diameter, it can no longer rely on passive diffusion for
nutrition and removal of metabolic waste. At this point,
the tumor mass must make intimate contact with the
circulatory system. Thus, cells within more advanced tumors
secrete angiogenic factors which promote neovascularization,
i.e., the growth of blood vessels from surrounding tissue
into the tumor mass (Folkman, J. and Klagsbrun, M., 1987,
Science 235:442; Liotta, L.A. et al., 1991, Cell 64:327).
Among these angiogenic factors are the fibroblast growth
factor (FGF) and endothelial cell growth factor (ECGF).
Neovascularization can, in fact, be an essential precursor
to metastasis. First, the process is required for a large
increase in tumor cell number, which, in turn, allows the
appearance of rare metastatic variants. Further,
neovascularization provides a direct portal of entry into the
circulatory system which can be used by metastasizing cells.

A variety of biochemical factors have been 
associated with different phases of metastases. Cell 
surface receptors for collagen, glycoproteins such as 
laminin, or proteoglycans, facilitate tumor cell attachment, 
an important step in invasion and metastases. Attachment 
then triggers the release of degradative enzymes which 
facilitate the penetration of tumor cells through tissue 
barriers. Once the tumor cell has entered the target 
tissue, specific growth factors are required for further 
proliferation. 

It is apparent that the complex process of tumor 
progression must involve multiple gene products. It is 
therefore important to define the role of specific genes 
involved in tumor progression, to identify those gene 
products involved in the tumor progression process and to 
further identify those gene products which can serve as 
therapeutic targets for the diagnosis, prevention and 
treatment of metastases of various forms of cancers. 

Some attempts have been made to study genes which
are thought to elicit or augment tumor progression
phenotypes. Mutations may drive a wave of cellular
multiplication associated with gradual increases in tumor
size, disorganization and malignancy. For example, a
mutation in a tumor suppressor gene which is a negative
regulator of cellular proliferation results in a loss of
crucial control over tumor growth and progression.
Differential expression of the following suppressor genes
has been demonstrated in human cancers: the retinoblastoma
gene, RB; the Wilms' tumor gene, WT1 (11p); the gene deleted
in colon carcinoma, DCC (18q); the neurofibromatosis type 1
gene, NF1 (17q); and the gene involved in familial
adenomatous polyposis coli, APC (5q) (Vogelstein, B. and
Kinzler, K.W., 1993, Trends Genet. 9:138-141).

Insight into the complex events that lead from 
normal cellular growth to neoplasia, invasion and metastasis 
is crucial for the development of effective diagnostic and 
therapeutic strategies. The foregoing studies are aimed at 
defining the role of particular gene products presumed to be 
involved in tumor progression. However, such approaches 
cannot identify the full panoply of gene products that are 
involved in the cascade of steps in tumor progression. A 
great need, therefore, exists for the successful 
identification of those genes which are differentially 
expressed in cells involved in or predisposed to a tumor 
progression phenotype. Such differentially expressed genes
and/or gene products can represent useful diagnostic markers 
and/or therapeutic targets for tumor progression disorders. 
With respect to diagnostic techniques, such genes and/or 
gene products could represent useful markers for the 
diagnosis, especially early diagnosis, given the correlation 
between early diagnosis and successful cancer treatment. 
With respect to therapeutic treatments, such differentially 
expressed genes and/or gene products could represent useful
targets for therapeutic treatment of various forms of tumor
progression disorders, including metastatic and
non-metastatic neoplastic disorders, and for inhibiting the
progression of pre-neoplastic lesions (e.g., hyperplastic
lesions or other benign tumors) to malignant tumors.

Differentially expressed genes involved in tumor
metastasis have been identified using murine melanoma cell
lines of varying metastatic potentials,
N-nitroso-methylurea-induced rat mammary carcinomas, mammary
carcinoma cell lines, human breast tumors and spontaneous
colonic and intestinal tumors in mice (Steeg, P.S., et al.,
1988, J. Natl. Cancer Inst. 80:200-204; Qian, F., et al.,
1994, Cell 77:335-347; Leone, A., et al., 1991, Cell
65:25-35; Zou, Z., et al., 1994, Science 263:526-529; and
Fodde, R., et al., 1994, Proc. Natl. Acad. Sci. USA
91:8969-8973).

3. SUMMARY OF THE INVENTION
The present invention relates to methods and
compositions for diagnosis, prevention, and treatment of
tumor progression. Specifically, murine and human genes are
identified and described which are differentially expressed
in tumor cells relative to normal cells and/or to tumor
cells at a different stage of tumor progression. For
example, genes are identified which are differentially
expressed in benign (e.g., non-malignant) tumor cells
relative to malignant, metastatic tumor cells. The
modulation of the expression of the identified genes and/or
the activity of the identified gene products can be utilized
therapeutically to treat disorders involving tumor
progression, including, for example, metastatic disorders.
As such, methods and compositions are described for the
identification of novel therapeutic compounds for the
inhibition of tumor progression and the treatment of tumor
progression disorders, including metastatic diseases.

Further, the identified genes and/or gene products 
can be used to identify cells exhibiting or predisposed to a 
disorder involving a tumor progression phenotype, thereby 
diagnosing individuals having, or at high risk for 
developing, such disorders. Additionally, the identified 
genes and/or gene products can be used to grade or stage
identified tumor cells. Still further, the detection of the 
differential expression of identified genes can be used to 
devise treatments (for example, chemoprevention) before the 
benign cells attain a malignant state. Still further, the 
detection of differential expression of identified genes can 
be used to design a preventive intervention in pre- 
neoplastic cells in individuals at high risk. 

"Tumor progression," as used herein, refers to any 
event which, first, promotes the transition of a normal, 
non-neoplastic cell to a cancerous, neoplastic one. Such 
events include ones which occur prior to the onset of 
neoplasia, and which predispose, or act as a step toward, 
the cell becoming neoplastic. These events can, for 
example, include ones which cause a normal cell to exhibit a 
pre-neoplastic phenotype. Second, such events also include 
ones which bring about the transition from a pre-neoplastic 
state to a neoplastic one. Such events can, for example, 
include ones which promote two hallmarks of the neoplastic 
state, namely unhindered cell proliferation and/or tumor 
cell invasion of adjacent tissue. Third, tumor progression 
can include events which promote the transition of a tumor 
cell to a metastatic state. Within each state (e.g.,
pre-neoplastic, neoplastic and metastatic), the term "tumor
progression" as used herein can also refer to the disorder
severity or aggressiveness a cell exhibits relative to other
cells within the same state.

Because multiple tumor progression events occur as a 
cell progresses from normal to neoplastic and metastatic 
states, certain cells will have undergone a different set of
such tumor progression events. As such, such cells are 
referred to herein as belonging to different "tumor 
progression stages." 

A "disorder involving tumor progression" or a "tumor
progression disorder," as used herein, refers to the state
of a cell or cells which have undergone or are in the
process of undergoing a tumor progression event, as defined
above.

"Differential expression," as used herein, refers to
both quantitative, as well as qualitative, differences in
the genes' temporal and/or cellular expression patterns
among, for example, normal and neoplastic tumor cells,
and/or among tumor cells which have undergone different
tumor progression events. Differentially expressed genes
can represent "fingerprint genes," and/or "target genes."

"Fingerprint gene," as used herein, refers to a 
differentially expressed gene whose expression pattern can 
be utilized as part of a prognostic or diagnostic marker for 
the evaluation of a disorder involving tumor progression, or
which, alternatively, can be used in methods for identifying
compounds useful for the treatment of such disorders. For
example, the effect of the compound on the fingerprint gene
expression normally displayed in connection with disorders
involving tumor progression can be used to evaluate the
efficacy of the compound as a treatment for such a disorder,
or can, additionally, be used to monitor patients undergoing
clinical evaluation for the treatment of the disorder.

"Fingerprint pattern," as used herein, refers to the
pattern generated when the expression pattern of a series 
(which can range from two up to all the fingerprint genes 
which exist for a given state) of fingerprint genes is 
determined. A fingerprint pattern can be used in the same 
diagnostic, prognostic and compound identification methods 
as the expression of a single fingerprint gene. 

"Target gene," as used herein, refers to a 
differentially expressed gene involved in tumor progression 
such that modulation of the level of target gene expression 
or of target gene product activity can act to prevent and/or
ameliorate symptoms of the tumor progression. Compounds 
that modulate the expression of the target gene or the 
activity of the target gene product can be used in the 
treatment of neoplastic diseases, including, for example, 
disorders involving the progression to a metastatic state. 
Still further, compounds that modulate the expression of the 
target gene or activity of the target gene product can be 
used in treatments to prevent benign cells from attaining a 
malignant state. Still further, compounds that modulate the 
expression of the target gene or activity of the target gene 
product can be used to design a preventive intervention in 
pre-neoplastic cells in individuals at high risk. 

Further, "pathway genes" are defined via the ability 
of their products to interact with other gene products 
involved in tumor progression disorders. Pathway genes can 
also exhibit target gene and/or fingerprint gene
characteristics.

The present invention includes the products of such 
fingerprint, target, and pathway genes, as well as 
antibodies to such gene products. Furthermore, the 
engineering and use of cell-based and/or animal-based models
of tumor progression disorders, including disorders
involving metastasis, to which such gene products can
contribute, are described.

The present invention also relates to methods for
prognostic and diagnostic evaluation of tumor progression
conditions, and for the identification of subjects
containing cells predisposed to such conditions.
Furthermore, the invention provides methods for evaluating
the efficacy of therapies for disorders involving tumor
progression, and for monitoring the progress of patients
participating in clinical trials for the treatment of such
diseases.

The tumor progression disorders described herein can
include disorders involved in the progression of such human
cancers as, for example, human melanomas, breast,
gastrointestinal, such as esophageal, stomach, colon, bowel,
colorectal and rectal cancers, prostate, bladder,
testicular, ovarian, uterine, cervical, brain, lung,
bronchial, larynx, pharynx, liver, pancreatic, thyroid,
bone, leukemias, lymphomas, and various types of skin
cancers.

The invention also provides methods for the 
identification of compounds that modulate the expression of 
genes or the activity of gene products involved in tumor 
progression, including the progression of metastatic
neoplastic diseases, as well as methods for the treatment of
such diseases. Such methods can, for example, involve the 
administration of such compounds to individuals exhibiting 
symptoms or markers of tumor progression, such as markers 
for metastatic neoplastic diseases. 

This invention is based, in part, on systematic
search strategies involving in vivo and in vitro paradigms
of tumor progression, including the progression to
metastatic disease, coupled with sensitive and high
throughput gene expression assays, to identify genes
differentially expressed in tumor cells relative to normal 
cells and/or relative to tumor cells at a different tumor 
progression stage. In contrast to approaches that merely 
evaluate the expression of a given gene product presumed to 
play a role in one or another of the various stages of tumor 
progression, such as, for example, the progression to a
metastatic disease process, the search strategies and assays 
used herein permit the identification of all genes, whether 
known or novel, which are differentially expressed in tumor 
cells relative to normal cells or relative to tumor cells at 
a different stage of tumor progression. 

This comprehensive approach and evaluation permits 
the discovery of novel genes and gene products, as well as 
the identification of an array of genes and gene products 
(whether novel or known) involved in novel pathways that 
play a major role in the disease pathology. Thus, the 
present invention makes possible the identification and 
characterization of targets useful for prognosis, diagnosis, 
monitoring, rational drug design, and/or other therapeutic 
intervention of tumor progression disorders, including 
disorders involving metastasis. 

The Example presented in Section 6, below, 
demonstrates the successful use of tumor progression search 
strategies of the invention to identify genes which are 
differentially expressed within tumor cells relative to 
tumor cells at a different stage of tumor progression. 
Specifically, the Example identifies a gene which is 
differentially expressed in metastatic cell populations 
relative to benign, non-malignant tumor cells. 

This gene, referred to herein as the 030 gene
(fomy030 in the mouse and fohy030 in humans), is a novel
gene which is expressed at a many-fold higher level in
non-metastatic tumor cells relative to its expression in
metastatic tumor cells. The gene appears in mice and has
the cDNA sequence shown in FIGS. 3A and 3B (SEQ ID NO:2). A
homologous gene, referred to herein as the fohy030 gene,
appears in humans and has the cDNA sequence shown in FIG. 5
(SEQ ID NO:6). An alternative splice form of the human cDNA
has the sequence shown in FIG. 6 (SEQ ID NO:8). Unless
stated expressly otherwise, any general reference to the 030
gene hereinafter refers to both the murine (fomy030) and
human (fohy030) homologs of this gene.

The identification of the 030 gene and the 
characterization of its expression in particular stages of 
metastatic spread provides, therefore, newly identified 
targets for the diagnosis, prevention, and treatment of
tumor progression disorders, including metastatic neoplastic
diseases.

Its expression pattern indicates that the 030 gene 
product acts to inhibit tumor progression. For example, a 
reduction in the level of 030 gene expression correlates 
with an increase in a cell's metastatic potential, i.e., a
reduction of 030 gene product in tumor cells can induce or 
predispose a cell to progress to a metastatic state. 

Hence, any method which can bring about an increase 
in the amount of 030 gene product can inhibit or slow the 
progression to metastasis. In fact, it is possible that the
030 gene product exhibits general tumor inhibition 
properties. 

A cDNA clone of the murine homolog, designated 
fomy030, is described herein in FIGS. 3A and 3B (SEQ ID
NO:2) (nucleotide sequence and amino acid sequence), and was
derived from fomy030 mRNA. However, as used herein, fomy030 
cDNA refers to any DNA sequence that encodes the amino acid 
sequence depicted in FIGS. 3A and 3B (SEQ ID NO:3). 




A cDNA clone of the human homolog, designated
fohy030, is shown in FIG. 5 (SEQ ID NO:6) (nucleotide
sequence and amino acid sequence). An alternative splice
form of fohy030 is shown in FIG. 6 (SEQ ID NO:8). Both were
obtained using the entire mouse fomy030 cDNA as a probe.
However, as used herein, fohy030 cDNA refers to any DNA
sequence that encodes the amino acid sequences depicted in
FIG. 5 (SEQ ID NO:7) and FIG. 6 (SEQ ID NO:9).

3.1. DEFINITIONS 

"Tumor progression," as used herein, refers to any
event which, first, promotes the transition of a normal,
non-neoplastic cell to a cancerous, neoplastic one. Such
events include ones which occur prior to the onset of
neoplasia, and which predispose, or act as a step toward,
the cell becoming neoplastic. These events can, for
example, include ones which cause a normal cell to exhibit a
pre-neoplastic phenotype. Second, such events also include
ones which bring about the transition from a pre-neoplastic
state to a neoplastic one. Such events can, for example,
include ones which promote unhindered cell proliferation
and/or tumor cell invasion of adjacent tissue, which are
viewed as hallmarks of the neoplastic state. Third, tumor
progression can include events which promote the transition
of a tumor cell to a metastatic state. Within each state
(e.g., pre-neoplastic, neoplastic and metastatic), the term
"tumor progression" as used herein can also refer to the
disorder severity or aggressiveness a cell exhibits.

Because multiple tumor progression events occur as a
cell progresses from normal to neoplastic and metastatic
states, certain cells will have undergone a different set of
such tumor progression events. As such, such cells are
referred to herein as belonging to different "tumor
progression stages."

A "disorder involving tumor progression" or a "tumor
progression disorder," as used herein, refers to the state
of a cell or cells which have undergone or are in the
process of undergoing a tumor progression event, as defined
above.

"Differential expression," as used herein, refers to
both quantitative, as well as qualitative, differences in the
genes' temporal and/or cellular expression patterns among,
for example, normal and neoplastic tumor cells, and/or among
tumor cells which have undergone different tumor progression
events. Differentially expressed genes can represent
"fingerprint genes," and/or "target genes."

"Fingerprint gene," as used herein, refers to a
differentially expressed gene whose expression pattern can
be utilized as part of a prognostic or diagnostic marker for
the evaluation of tumor progression, or which,
alternatively, can be used in methods for identifying
compounds useful for the treatment of tumor progression.
For example, the effect of the compound on the fingerprint
gene expression normally displayed in connection with tumor
progression can be used to evaluate the efficacy of the
compound as a treatment for tumor progression, or can,
additionally, be used to monitor patients undergoing
clinical evaluation for the treatment of tumor progression.

"Fingerprint pattern," as used herein, refers to the 
pattern generated when the expression pattern of a series 
(which can range from two up to all the fingerprint genes
which exist for a given state) of fingerprint genes is
determined. A fingerprint pattern can be used in the same
diagnostic, prognostic and compound identification methods
as the expression of a single fingerprint gene.



"Target gene," as used herein, refers to a 
differentially expressed gene involved in tumor progression 
such that modulation of the level of target gene expression 
or of target gene product activity can act to prevent and/or
ameliorate symptoms of the tumor progression. Compounds
that modulate target gene expression or activity of the
target gene product can be used in the treatment of tumor
progression and tumor progression disorders, including, for
example, disorders involving the progression to a metastatic
state.

Further, "pathway genes" are defined via the ability 
of their products to interact with other gene products 
involved in tumor progression. Pathway genes can also 
exhibit target gene and/or fingerprint gene characteristics. 

4. DESCRIPTION OF THE FIGURES

FIG. 1 is a Northern blot confirming differential
regulation of the 030 gene. Total RNA (12 μg/lane) obtained
from F1 (lanes 1 and 3) and F10 (lanes 2 and 4) melanoma
cell cultures was hybridized with a cDNA probe prepared by
random priming of reamplified romy030 band. (See materials
and methods below in Section 6.1.) The romy030 probe
identifies an RNA band of approximately 3 kb, corresponding
to a fomy030 mRNA.

FIG. 2 is a nucleotide sequence of romy030 band (SEQ
ID NO:1).

FIGS. 3A and 3B are representations of the 
nucleotide and derived amino acid sequences of cDNA clone 
fomy030 (SEQ ID NOs:2 [nucleotide sequence] and 3 [amino 
acid sequence]) derived from fomy030 mRNA. 

FIG. 4 is a Northern blot analysis confirming
differential regulation of the fomy030 gene. Lane 1 is B16
F1, lane 2 is B16 F10, and lanes 3-6 are B16 H5, B16 H6, B16
H7 and B16 H8.

FIG. 5 is a representation of the nucleotide and
deduced amino acid sequences of cDNA clone of fohy030 (SEQ
ID NOs:6 [nucleotide sequence] and 7 [amino acid sequence]).

FIG. 6 is a comparison of the nucleotide and deduced
amino acid sequences of another cDNA clone of fohy030 (SEQ
ID NOs:8 [nucleotide sequence] and 9 [amino acid sequence]).

In FIGS. 3A and 3B, the nucleotide sequence is
numbered starting at the first nucleotide, whereas in FIGS.
5 and 6, the nucleotide sequence is numbered starting at the
ATG start codon.

5. DETAILED DESCRIPTION OF THE INVENTION 

Methods and compositions are described for the prevention,
treatment and diagnosis of tumor progression, including
tumor progression involving metastatic disorders, in cells
involved in human tumors. Such human tumors may include,
for example, human melanomas, breast, gastrointestinal
tumors such as esophageal, stomach, duodenal, colon,
colorectal and rectal cancers, prostate, bladder,
testicular, ovarian, uterine, cervical, brain, lung,
bronchial, larynx, pharynx, liver, pancreatic, thyroid,
bone, various types of skin cancers and other neoplastic
conditions such as leukemias and lymphomas. The invention is
based, in part, on the evaluation of the expression and role
of all genes that are differentially expressed in tumor cells
relative to normal cells and/or relative to tumor cells at a
different stage of tumor progression. This permits the
definition of disease pathways and identification of targets
in such pathways that are useful for diagnosis, prevention
and treatment of tumor progression, including the tumor
progression disorders involving metastatic neoplastic
diseases.

Genes, termed "target genes" and/or "fingerprint
genes," are described which are differentially expressed in
tumor cells relative to their expression in normal cells or
relative to their expression in tumor cells which are at a
different stage of tumor progression. Additionally, genes,
termed "pathway genes," are described whose gene products
exhibit an ability to interact with gene products involved
in tumor progression, including tumor progression disorders
involving metastatic neoplastic disorders. Pathway genes
can additionally have fingerprint and/or target gene
characteristics. Methods for the identification of such
fingerprint, target, and pathway genes are also described.

Further, the gene products of such fingerprint,
target, and pathway genes are described in Section 5.2.2,
antibodies to such gene products are described in Section
5.2.3, as are cell- and animal-based models of tumor
progression disorders to which such gene products can
contribute, in Section 5.2.4.

Methods for the identification of compounds which 
modulate the expression of genes or the activity of gene 
products involved in tumor progression are described in 
Section 5.3. Methods for monitoring the efficacy of
compounds during clinical trials are described in Section
5.3.5. Additionally described, below, are methods for 
treatment of tumor progression disorders, including 
metastatic diseases. 

Also discussed, below, are methods for prognostic
and diagnostic evaluation of tumor progression and disorders
involving tumor progression, including metastatic disorders,
and, further, for the identification of subjects exhibiting
a predisposition to such disorders.

5.1. IDENTIFICATION OF DIFFERENTIALLY
EXPRESSED GENES

Described herein are methods for the identification
of differentially expressed genes which are involved in
tumor progression. There exist a number of levels or stages
at which the differential expression of such genes can be
exhibited. For example, differential expression can occur
in tumor cells relative to normal cells, or in tumor cells
within different stages of tumor progression. For example,
genes can be identified which are differentially expressed
in pre-neoplastic versus neoplastic cells. Such genes can
include, for example, ones which promote unhindered cell
proliferation or tumor cell invasion of adjacent tissue,
both of which are viewed as hallmarks of the neoplastic
state. Further, differential expression can occur in benign
(e.g., non-malignant) tumor cells versus metastatic,
malignant tumor cells. Still further, differential
expression can occur among cells within any one of these
states (e.g., pre-neoplastic, neoplastic and metastatic),
and can indicate, for example, a difference in tumor
progression severity or aggressiveness of one cell relative
to that of another cell within the same state.

Methods for the identification of such
differentially expressed genes are described, below, in
Section 5.1.1. Methods for the further characterization of
such differentially expressed genes, and for their
categorization as target and/or fingerprint genes, are
presented, below, in Section 5.3.

"Differential expression" as used herein refers to
both quantitative, as well as qualitative differences in the
genes' temporal and/or tissue expression patterns. Thus, a
differentially expressed gene can qualitatively have its
expression activated or completely inactivated in, for
example, normal versus tumor progression states, in cells
within different tumor progression states or among cells
within a single given tumor progression state. Such a
qualitatively regulated gene will exhibit an expression
pattern within a given state which is detectable by standard
techniques in one such state, but is not detectable in both
states being compared. "Detectable," as used herein, refers
to an RNA expression level which is detectable via the
standard techniques of differential display, RT (reverse
transcriptase)-coupled PCR, Northern and/or RNase protection
analyses.

Alternatively, a differentially expressed gene can
exhibit an expression level which differs, i.e., is
quantitatively increased or decreased in normal versus tumor
progression states, in cells within different tumor
progression states or among cells within a single given
tumor progression state.

The degree to which expression differs need only be
large enough to be visualized via standard characterization
techniques, such as, for example, the differential display
technique described below. Other standard, well-known
characterization techniques by which expression differences
can be visualized include, but are not limited to,
quantitative RT (reverse transcriptase)-coupled PCR,
Northern analyses and RNase protection techniques.
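
The distinction drawn above between qualitative and quantitative
differential expression can be sketched in a few lines of Python. This is
only an illustration: the function name, the numeric detection limit and
the minimum fold change are assumptions introduced here, since the text
requires only that a difference be large enough to visualize by standard
techniques.

```python
def classify_differential_expression(level_a, level_b,
                                     detection_limit=1.0, min_fold=2.0):
    """Classify one gene's expression difference between two states.

    detection_limit and min_fold are illustrative assumptions; the
    specification gives no numeric thresholds.
    """
    if detection_limit <= 0:
        raise ValueError("detection_limit must be positive")
    detectable_a = level_a >= detection_limit
    detectable_b = level_b >= detection_limit
    # Qualitative: detectable in one state but not in the other.
    if detectable_a != detectable_b:
        return "qualitative"
    if not detectable_a:
        return "not detectable"
    # Quantitative: detectable in both states, but at different levels.
    fold = max(level_a, level_b) / min(level_a, level_b)
    return "quantitative" if fold >= min_fold else "not differential"
```

In practice the thresholds would depend on the assay (differential
display, Northern analysis and so on) rather than on fixed numbers.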

Differentially expressed genes can be further
described as target genes and/or fingerprint genes.
"Fingerprint gene," as used herein, refers to a
differentially expressed gene whose expression pattern can
be utilized as part of a prognostic or diagnostic marker in
tumor progression evaluation, or which, alternatively, may
be used in methods for identifying compounds useful for the
prevention or treatment of tumor progression and tumor
progression disorders, including metastatic disorders. A
fingerprint gene can also have the characteristics of a
target gene or a pathway gene (see below, in Section 5.2).

"Fingerprint pattern," as used herein, refers to the 
pattern generated when the expression pattern of a series
(which can range from two up to all the fingerprint genes
which exist for a given state) of fingerprint genes is
determined. A fingerprint pattern can be used in the same
diagnostic, prognostic and compound identification methods
as the expression of a single fingerprint gene.
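
As a rough illustration of how a fingerprint pattern might be represented
and compared, the sketch below reduces each fingerprint gene to a
detectable/not-detectable call and matches a sample against reference
patterns by counting mismatches. The function names, the detection limit
and the mismatch-count comparison are all assumptions made for this
example, not a method stated in the specification.

```python
def fingerprint_pattern(levels, genes, detection_limit=1.0):
    """Return a tuple of detectable/not-detectable calls, one per gene.

    levels maps gene name -> measured expression level (hypothetical units).
    """
    return tuple(levels.get(gene, 0.0) >= detection_limit for gene in genes)


def closest_state(sample_pattern, reference_patterns):
    """Return the reference state whose pattern has the fewest mismatches."""
    def mismatches(state):
        reference = reference_patterns[state]
        return sum(a != b for a, b in zip(sample_pattern, reference))
    return min(reference_patterns, key=mismatches)


# Hypothetical fingerprint genes and reference patterns.
genes = ["geneA", "geneB", "geneC"]
references = {
    "low metastatic potential": (True, True, False),
    "high metastatic potential": (False, False, True),
}
sample = fingerprint_pattern({"geneA": 5.0, "geneB": 0.2, "geneC": 0.1}, genes)
best_match = closest_state(sample, references)
```

A real fingerprint would more likely use graded expression levels; the
boolean calls simply keep the example short.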

"Target gene," as used herein, refers to a 
differentially expressed gene involved in tumor progression
in a manner by which modulation of the level of target gene
expression or of target gene product activity can act to
prevent and/or ameliorate symptoms of disorders involving
tumor progression. Tumor progression disorders include, for
example, disorders involved in human tumors, including, but
not limited to, human melanomas, breast, gastrointestinal,
such as esophageal, stomach, colon, bowel, colorectal and
rectal cancers, prostate, bladder, testicular, ovarian,
uterine, cervical, brain, lung, bronchial, larynx, pharynx,
liver, pancreatic, thyroid, bone, leukemias, lymphomas and
various types of skin cancers. A target gene can also have
the characteristics of a fingerprint gene and/or a pathway
gene (as described, below, in Section 5.2).

5.1.1. METHODS FOR THE IDENTIFICATION OF
DIFFERENTIALLY EXPRESSED GENES

A variety of methods can be utilized for the
identification of genes which are involved in tumor
progression. Described in Section 5.1.1.1 are experimental
paradigms which can be utilized for the generation of
samples which can be used for the identification of such
genes. Material generated in paradigm categories can be
characterized for the presence of differentially expressed
gene sequences as discussed, below, in Section 5.1.1.2.



5.1.1.1. PARADIGMS FOR THE IDENTIFICATION
OF DIFFERENTIALLY EXPRESSED GENES

Paradigms which represent models of tumor
progression states are described herein. These paradigms
can be utilized for the identification of genes which are
differentially expressed in normal cells versus cells in
tumor progression states, in cells within different tumor
progression states or among cells within a single given
tumor progression state.

The paradigms described herein include at least two
groups of cells of a given cell type, preferably genetically
matched cells (e.g., cells derived from variants of the same
cell line, or cells derived from a single individual or
biological sample), whose expression patterns are compared
and analyzed for differential expression. Methods for the
analysis of paradigm material are described, below, in
Section 5.1.1.2.

Once a particular gene has been identified through
the use of one paradigm, its expression pattern can be
further characterized, for example, by studying its
expression in a different paradigm. A gene can, for
example, be regulated one way, i.e., can exhibit one
differential gene expression pattern, in a given paradigm,
but can be regulated differently in another paradigm. The
use, therefore, of multiple paradigms can be helpful in
distinguishing the roles and relative importance of
particular genes in tumor progression.

In one embodiment of such a paradigm, referred to
herein as the "in vitro" paradigm, cell lines can be used to
identify genes which are differentially expressed in tumor
progression states. Differentially expressed genes are
detected, as described herein, by comparing the pattern of
gene expression between the experimental and control
conditions. In such a paradigm, genetically matched tumor
cell lines (e.g., variants of the same cell line) are
generally utilized. For example, the gene expression
patterns of two variant cell lines can be compared, wherein
one variant exhibits characteristics of one tumor progression
state while the other variant exhibits characteristics of
another tumor progression state. Alternatively, two variant
cell lines can be compared, both of which exhibit
characteristics of the same tumor progression state, but
which exhibit differing degrees of tumor progression
disorder severity or aggressiveness. Further, genetically
matched cell lines can be utilized, one of which exhibits
characteristics of a tumor progression state, while the
other exhibits a normal cellular phenotype.

The variant cell lines utilized herein can exhibit
such tumor progression characteristics as, for example, a
high or low metastatic potential, which refers to the
likelihood that a cell will give rise to a distant site
tumor mass. Alternatively, one or more such variant cell
lines can exhibit pre-neoplastic characteristics or can
exhibit characteristics generally associated with one or
more neoplastic cell phenotypes, such as, for example, cell
proliferation or invasion phenotypes.

In accordance with this aspect of the invention, the
cell line variants are cultured under appropriate
conditions, the cells are harvested, and RNA is isolated and
analyzed for differentially expressed genes, as described in
detail in Section 5.1.1.2, below.






Examples of cell lines that can be used as part of
such in vitro paradigms include, but are not limited to,
variants of melanoma cell lines, such as, for example, the
murine melanoma B16 F1 cell line, which exhibits a low
metastatic potential, and the melanoma B16 F10 cell line,
which exhibits a high metastatic potential (Fidler, I.J.,
1973, Nature New Biol. 242:148-149); human colon cell lines,
such as, for example, KM12c (tumor cell line with low
metastatic potential) and KM20L4 (tumor cell line with
high metastatic potential; Morikawa, K., et al., 1988, Cancer
Research 48:1943-1948); prostatic tumor cell lines, such as,
for example, DU 145 (non-metastatic tumor cell line) and
PC-3-M (high metastatic potential tumor cell line; Karmali,
R.A. et al., 1987, Anticancer Res. 7:1173-1180, and
Kozlowski, J.M. et al., 1984, Cancer Research 44:3522-3529);
and breast carcinoma tumor cell lines, such as, for example,
MCF-7 (non-metastatic tumor cell line) and MDA-MB-435 (high
metastatic potential tumor cell line; Watts, C.K. et al.,
1994, Breast Cancer Res. Treat. 31:95-105 and Rose, D.P. et
al., 1993, J. Natl. Cancer Inst. 85:1743-1747).

As presented in the Example in Section 6,
below, this paradigm has been successfully utilized to
identify a gene, referred to herein as the 030 gene, which
is differentially expressed in cells exhibiting a high
metastatic potential relative to cells exhibiting a low
metastatic potential. Specifically, the 030 gene is
expressed at a many-fold higher level in low metastatic
potential cells relative to cells exhibiting a high
metastatic potential.

In a second paradigm, referred to herein as the in
vivo paradigm, animal models of tumor progression disorders
can be utilized to discover differentially expressed gene
sequences. The in vivo nature of such tumor progression
models can prove to be especially predictive of the
analogous responses in living patients.

A variety of tumor progression animal models can be
used as part of the in vivo paradigms. For example,
animal models of tumor progression may be generated by
passaging tumor cells in animals (e.g., mice), leading to
the appearance of tumors within these animals.

Additional animal models, some of which may exhibit
differing tumor progression characteristics, may be
generated from the original animal models described above.
For example, the tumors which result in the original animals
can be removed and grown in vitro. Cells from these in
vitro cultures can then be passaged in animals and tumors
resulting from this passage can then be isolated. RNA from
pre-passage cells, and cells isolated after one or more
rounds of passage can then be isolated and analyzed for
differential expression. The differential expression can be
compared to the metastatic potential expression of such
cells. These cells can now represent cells from different
tumor progression states, or cells within a given tumor
progression state exhibiting differing degrees of severity
or aggressiveness. Such passaging techniques can utilize
any of the variant cell lines described, above, for the in
vitro paradigms.

Additionally, animal models for tumor progression
which can be utilized for such an in vivo paradigm include
any of the animal models described, below, in Section 5.7.1.
Other models include a transgenic mouse model for melanoma
(Mintz, B. and Silvers, W.K., 1993, Proc. Natl. Acad. Sci.
USA 90:8817-8821), transgenic mice which carry specific
adenomatous polyposis coli (APC) gene mutations (Fodde, R.,
et al., 1994, Proc. Natl. Acad. Sci. USA 91:8969-8973) and
the transgenic mouse in which the mammary tumor virus
LTR/c-myc gene is anomalously expressed (Leder, A., et al.,
1986, Cell 45:485-495).

A third paradigm, referred to herein as the
"specimen paradigm," utilizes samples from surgical and
biopsy specimens. Such specimens can represent normal
tissue, primary, secondary or metastasized tumors obtained
from patients having undergone surgical treatment for
disorders involving tumor progression such as, for example,
melanomas, colon carcinomas, lung carcinomas, prostatic
cancers and breast cancers.

Surgical specimens can be procured under standard
conditions involving freezing and storing in liquid nitrogen
(see, for example, Karmali, R.A., et al., 1983, Br. J.
Cancer 48:689-696). RNA from specimen cells is isolated by,
for example, differential centrifugation of homogenized
tissue, and analyzed for differential expression relative to
other specimen cells, preferably cells obtained from the
same patient.

In paradigms designed to identify genes which are
involved in tumor progression, compounds known to have an
ameliorative effect on the tumor progression symptoms can
also be used in paradigms to detect differentially expressed
genes. Such compounds can include known therapeutics, as
well as compounds that are not useful as therapeutics due to
their harmful side effects. For example, tumor cells that
are cultured as explained in this Section, above, can be
exposed to one of these compounds and analyzed for
differential gene expression with respect to untreated tumor
cells, according to the methods described below in Section
5.1.1.2. In principle, however, according to the paradigm,
any cell type involved in tumor progression and disorders
thereof can be treated by these compounds at any stage of
the tumor progression process.



Cells involved in tumor progression can also be
compared to unrelated cells (e.g., fibroblasts) which have
been treated with the compound, such that any generic
effects on gene expression that might not be related to the
disease or its treatment may be identified. Such generic
effects might be manifest, for example, by changes in gene
expression that are common to the test cells and the
unrelated cells upon treatment with the compound.
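
The control comparison described above amounts to separating expression
changes shared with the unrelated cells from changes seen only in the
test cells. A minimal sketch follows; the function name and the "up" and
"down" labels are assumptions introduced for illustration.

```python
def split_generic_effects(test_changes, unrelated_changes):
    """Separate compound-induced changes into specific and generic sets.

    Each argument maps a gene name to "up" or "down" (hypothetical labels)
    for the change seen on treatment with the compound.
    """
    # Generic: same gene changes in the same direction in both cell types.
    generic = {gene for gene, direction in test_changes.items()
               if unrelated_changes.get(gene) == direction}
    # Specific: changes seen in the test cells but not shared.
    specific = set(test_changes) - generic
    return specific, generic


# Hypothetical example data.
tumor_cells = {"geneA": "up", "geneB": "down", "geneC": "up"}
fibroblasts = {"geneB": "down", "geneC": "down"}
specific, generic = split_generic_effects(tumor_cells, fibroblasts)
```

Here a gene that changes in opposite directions in the two cell types is
treated as specific, which is one possible reading of "common" changes.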

By these methods, the genes and gene products upon
which these compounds act can be identified and used in the
assays described below to identify novel therapeutic
compounds for inhibition of tumor progression and the
treatment of tumor progression disorders, including
metastatic diseases.

5.1.1.2. ANALYSIS OF PARADIGM MATERIAL

In order to identify differentially expressed genes,
RNA, either total or mRNA, can be isolated from cells
utilized in paradigms such as those described earlier in
Section 5.1.1.1. Any RNA isolation technique which does not
select against the isolation of mRNA can be utilized for the
purification of such RNA samples. See, for example,
Ausubel, F.M. et al., eds., 1987-1993, Current Protocols in
Molecular Biology, John Wiley & Sons, Inc., New York, which
is incorporated herein by reference in its entirety.
Additionally, large numbers of tissue samples can readily be
processed using techniques well known to those of skill in
the art, such as, for example, the single-step RNA isolation
process of Chomczynski, P. (1989, U.S. Patent No.
4,843,155), which is incorporated herein by reference in its
entirety.

Transcripts within the collected RNA samples which
represent RNA produced by differentially expressed genes can
be identified by utilizing a variety of methods which are
well known to those of skill in the art. For example,
differential screening (Tedder, T.F. et al., 1988, Proc.
Natl. Acad. Sci. USA 85:208-212), subtractive hybridization
(Hedrick, S.M. et al., 1984, Nature 308:149-153; Lee, S.W.
et al., 1984, Proc. Natl. Acad. Sci. USA 88:2825), and,
preferably, differential display (Liang, P. and Pardee,
A.B., 1993, U.S. Patent No. 5,262,311, which is incorporated
herein by reference in its entirety), can be utilized to
identify nucleic acid sequences derived from genes that are
differentially expressed.

Differential screening involves the duplicate
screening of a cDNA library in which one copy of the library
is screened with a total cell cDNA probe corresponding to
the mRNA population of one cell type while a duplicate copy
of the cDNA library is screened with a total cDNA probe
corresponding to the mRNA population of a second cell type.
For example, one cDNA probe can correspond to a total cell
cDNA probe of a cell type or tissue derived from a control
subject, while the second cDNA probe can correspond to a
total cell cDNA probe of the same cell type derived from an
experimental subject. Those clones which hybridize to one
probe but not to the other potentially represent clones
derived from genes differentially expressed in the cell type
of interest in control versus experimental subjects.
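
The selection rule in differential screening, namely keeping clones that
hybridize to one probe but not to the other, can be expressed as a pair
of set differences. The sketch below assumes that each screen has already
been reduced to a collection of positive clone identifiers; the function
name and the identifiers are hypothetical.

```python
def differential_clones(hits_control, hits_experimental):
    """Return clones positive with only one of the two probes.

    Each argument is an iterable of clone identifiers that hybridized to
    the control or the experimental cDNA probe, respectively.
    """
    control = set(hits_control)
    experimental = set(hits_experimental)
    control_only = control - experimental
    experimental_only = experimental - control
    return control_only, experimental_only


# Hypothetical clone identifiers from duplicate library screens.
control_only, experimental_only = differential_clones(
    ["c1", "c2", "c3"], ["c2", "c3", "c4"])
```

Clones returned this way are only candidates; the text goes on to require
corroboration by an independent technique.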

Subtractive hybridization techniques generally
involve the isolation of mRNA taken from two different
sources, e.g., control and experimental tissue, the
hybridization of the mRNA or single-stranded cDNA
reverse-transcribed from the isolated mRNA, and the removal
of all hybridized, and therefore double-stranded, sequences.
The remaining non-hybridized, single-stranded cDNAs potentially
represent clones derived from genes that are differentially
expressed in the two mRNA sources. Such single-stranded
cDNAs are then used as the starting material for the
construction of a library comprising clones derived from
differentially expressed genes.

The differential display technique describes a 
procedure, utilizing the well-known polymerase chain 
reaction (PCR; the experimental embodiment set forth in
Mullis, K.B., 1987, U.S. Patent No. 4,683,202) which allows 
for the identification of sequences derived from genes which 
are differentially expressed. First, isolated RNA is 
reverse-transcribed into single-stranded cDNA, utilizing 
standard techniques which are well known to those of skill 
in the art. Primers for the reverse transcriptase reaction 
can include, but are not limited to, oligo dT-containing 
primers, preferably of the 3' primer type of oligonucleotide 
described below. Next, this technique uses pairs of PCR 
primers, as described below, which allow for the 
amplification of clones representing a random subset of the 
RNA transcripts present within any given cell. Utilizing 
different pairs of primers allows each of the mRNA 
transcripts present in a cell to be amplified. Among such 
amplified transcripts can be identified those which have 
been produced from differentially expressed genes. 

The 3' oligonucleotide primer of the primer pairs
can contain an oligo dT stretch of 10-13 dT nucleotides at
its 5' end, preferably 11, which hybridizes to the poly(A)
tail of mRNA or to the complement of a cDNA reverse
transcribed from an mRNA poly(A) tail. Second, in order to
increase the specificity of the 3' primer, the primer can
contain one or more, preferably two, additional nucleotides
at its 3' end. Because, statistically, only a subset of the
mRNA derived sequences present in the sample of interest
will hybridize to such primers, the additional nucleotides
allow the primers to amplify only a subset of the mRNA
derived sequences present in the sample of interest. This
is preferred in that it allows more accurate and complete
visualization and characterization of each of the bands
representing amplified sequences.
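
To make the 3' primer design concrete, the sketch below enumerates
anchored primers consisting of an oligo dT stretch followed by additional
3' nucleotides. The function name is hypothetical, and the exclusion of T
at the first anchor position (so that the primer sits at the junction of
the poly(A) tail) is an assumption based on common practice; it is not
stated in the passage above.

```python
from itertools import product


def anchored_primers(dt_length=11, n_anchor=2):
    """Enumerate 3' primers: oligo dT followed by n_anchor extra bases.

    Assumption: the first anchor base is not T, so the primer anneals at
    the start of the poly(A) tail rather than anywhere within it.
    """
    if not 10 <= dt_length <= 13:
        raise ValueError("the text describes an oligo dT stretch of 10-13")
    if n_anchor < 1:
        raise ValueError("at least one additional nucleotide is required")
    primers = []
    for anchor in product("ACGT", repeat=n_anchor):
        if anchor[0] == "T":
            continue
        primers.append("T" * dt_length + "".join(anchor))
    return primers


primers = anchored_primers()  # preferred form: 11 dT plus two nucleotides
```

Under these assumptions the preferred two-nucleotide anchor gives twelve
distinct primers, each of which would be expected to prime only a subset
of the mRNA population.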

The 5' primer can contain a nucleotide sequence
expected, statistically, to have the ability to hybridize to
cDNA sequences derived from the tissues of interest. The
nucleotide sequence can be an arbitrary one, and the length
of the 5' oligonucleotide primer can range from about 9 to
about 15 nucleotides, with about 13 nucleotides being
preferred.

Additionally, arbitrary primer sequences cause the
lengths of the amplified partial cDNAs produced to be
variable, thus allowing different clones to be separated by
using standard denaturing sequencing gel electrophoresis.

PCR reaction conditions should be chosen which
optimize amplified product yield and specificity, and,
additionally, produce amplified products of lengths which
can be resolved utilizing standard gel electrophoresis
techniques. Such reaction conditions are well known to those
of skill in the art, and important reaction parameters
include, for example, length and nucleotide sequence of
oligonucleotide primers as discussed above, and annealing
and elongation step temperatures and reaction times.

The pattern of clones resulting from the reverse
transcription and amplification of the mRNA of two different
cell types is displayed via sequencing gel electrophoresis
and compared. Differences in the two banding patterns
indicate potentially differentially expressed genes.

Once potentially differentially expressed gene
sequences have been identified via bulk techniques such as,
for example, those described above, the differential
expression of such putatively differentially expressed genes
should be corroborated. Corroboration can be accomplished
via, for example, such well-known techniques as Northern
analysis, quantitative RT-coupled PCR or RNase protection.

Upon corroboration, the differentially expressed
genes can be further characterized, and can be identified as
target and/or fingerprint genes, as discussed, below, in
Section 5.1.4.

Also, amplified sequences of differentially
expressed genes obtained through differential display can be
used to isolate the full length clones of the corresponding
gene. The full-length coding portion of the gene can
readily be isolated, without undue experimentation, by
molecular biological techniques well known in the art. For
example, the isolated differentially expressed amplified
fragment can be labeled and used to screen a cDNA library.
Alternatively, the labeled fragment can be used to screen a
genomic library.

PCR technology can also be utilized to isolate
full-length cDNA sequences. As described in this section above,
the isolated amplified gene fragments (of at least about 10
nucleotides, preferably longer, of about 15 nucleotides)
obtained through differential display have their 5' terminal
end at some random point within the gene and have 3'
terminal ends at a position corresponding to the 3' end of
the transcribed portion of the gene. Once nucleotide
sequence information from an amplified fragment is
obtained, the remainder of the gene (i.e., the 5' end of the
gene, when utilizing differential display) can be obtained
using, for example, RT PCR.

In one embodiment of such a procedure for the
identification and cloning of full length gene sequences,
RNA can be isolated, following standard procedures, from an
appropriate tissue or cellular source.

A reverse transcription reaction can then be
performed on the RNA using an oligonucleotide primer
complementary to the mRNA that corresponds to the amplified
cloned fragment, for the priming of first strand synthesis.
Because the primer is anti-parallel to the mRNA, extension
will proceed toward the 5' end of the mRNA. The resulting
RNA/DNA hybrid can then be "tailed" with guanines using a
standard terminal transferase reaction, the hybrid can be
digested with RNase H, and second strand synthesis can then
be primed with a poly-C primer. Using the two primers, the
5' portion of the gene is then amplified using PCR.
Sequences obtained can then be isolated and recombined with
previously isolated sequences to generate a full-length cDNA
of the differentially expressed genes of the invention. For
a review of cloning strategies and recombinant DNA
techniques which can be used, see, e.g., Sambrook et al.,
1989, Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press, N.Y.; and Ausubel et al., 1989, Current
Protocols in Molecular Biology, (Green Publishing Associates
and Wiley Interscience, N.Y.).
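
The first-strand primer described above is complementary and
anti-parallel to the mRNA, which means its sequence is the reverse
complement of the chosen mRNA region. The sketch below shows that
relationship; the function name and the default primer length of 20
nucleotides are illustrative assumptions.

```python
# Map each mRNA base (U, or T if given as cDNA letters) to its DNA complement.
_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def first_strand_primer(mrna_region, length=20):
    """Return a DNA primer (5'->3') complementary to the start of mrna_region.

    mrna_region is written 5'->3'; only its first `length` bases are used.
    """
    region = mrna_region.upper()[:length]
    if set(region) - set("ACGUT"):
        raise ValueError("unexpected character in sequence")
    # Complement each base, then reverse so the primer reads 5'->3'.
    return region.translate(_COMPLEMENT)[::-1]


primer = first_strand_primer("AUGGCC")  # -> "GGCCAT"
```

Extension from the 3' end of such a primer copies the template toward
the 5' end of the mRNA, as the text states.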

5.2. METHODS FOR THE IDENTIFICATION
OF PATHWAY GENES

Methods are described herein for the identification
of pathway genes. "Pathway gene," as used herein, refers to
a gene whose gene product exhibits the ability to interact
with gene products involved in tumor progression. A pathway
gene can be differentially expressed and, therefore, can
have the characteristics of a target and/or fingerprint
gene.

Any method suitable for detecting protein-protein
interactions can be employed for identifying pathway gene
products by identifying interactions between gene products
and gene products known to be involved in tumor progression
and tumor progression disorders, including metastatic
disorders. Such known gene products can be cellular or
extracellular proteins. Those gene products which interact
with such known gene products represent pathway gene
products and the genes which encode them represent pathway
genes.

Among the traditional methods which can be employed
are co-immunoprecipitation, cross-linking and
co-purification through gradients or chromatographic columns.
Utilizing procedures such as these allows for the
identification of pathway gene products. Once identified, a
pathway gene product can be used, in conjunction with
standard techniques, to identify its corresponding pathway
gene. For example, at least a portion of the amino acid
sequence of the pathway gene product can be ascertained
using techniques well known to those of skill in the art,
such as via the Edman degradation technique (see, e.g.,
Creighton, 1983, Proteins: Structures and Molecular
Principles, W.H. Freeman & Co., N.Y., pp. 34-49). The amino
acid sequence obtained can be used as a guide for the
generation of oligonucleotide mixtures that can be used to
screen for pathway gene sequences. Screening can be
accomplished, for example, by standard hybridization or PCR
techniques. Techniques for the generation of
oligonucleotide mixtures and the screening are well known.
(See, e.g., Ausubel, supra, and PCR Protocols: A Guide to
Methods and Applications, 1990, Innis, M. et al., eds.
Academic Press, Inc., New York).

Additionally, methods can be employed which result
in the simultaneous identification of pathway genes which
encode the protein interacting with a protein involved in
tumor progression and tumor progression disorders, including
metastatic diseases. These methods include, for example,
probing expression libraries with labeled protein known or
suggested to be involved in metastatic diseases using this
protein in a manner similar to the well known technique of
antibody probing of λgt11 libraries.

One method which detects protein interactions in
vivo, the yeast two-hybrid system, is described in detail
for illustration only and not by way of limitation. One
version of this system has been described (Chien et al.,
1991, Proc. Natl. Acad. Sci. USA 88:9578-9582) and is
commercially available from Clontech (Palo Alto, CA).

Briefly, utilizing such a system, plasmids are
constructed that encode two hybrid proteins: the first
hybrid protein consists of the DNA-binding domain of a
transcription factor (e.g., activation protein) fused to a
known protein, in this case, a protein known to be involved
in tumor progression, and the second hybrid protein consists
of the transcription factor's activation domain fused to an
unknown protein that is encoded by a cDNA which has been
recombined into this plasmid as part of a cDNA library. The
plasmids are transformed into a strain of the yeast
Saccharomyces cerevisiae that contains a reporter gene
(e.g., lacZ) whose expression is regulated by the
transcription factor's binding site. Either hybrid protein
alone cannot activate transcription of the reporter gene.
The DNA binding hybrid protein cannot activate transcription
because it does not provide the activation domain function
and the activation domain hybrid protein cannot activate
transcription because it lacks the domain required for
binding to its target site (e.g., it cannot localize to the
transcription activator protein's binding site).
Interaction between the DNA binding hybrid protein and the
library encoded protein reconstitutes the functional
transcription factor and results in expression of the
reporter gene, which is detected by an assay for the
reporter gene product.
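
The logic of the two-hybrid readout described above can be summarized as
a simple conjunction: the reporter is expressed only when both hybrid
proteins are present and the bait and library-encoded proteins interact.
The sketch below models that logic only; the function names, the protein
names and the table of known interactions are hypothetical.

```python
def reporter_expressed(has_binding_hybrid, has_activation_hybrid,
                       proteins_interact):
    """Reporter is on only if both hybrids are present and they interact."""
    return has_binding_hybrid and has_activation_hybrid and proteins_interact


def screen_library(bait, library, interactions):
    """Return library proteins whose pairing with the bait turns on the reporter.

    interactions is a set of frozenset({protein_a, protein_b}) pairs standing
    in for physical interactions (hypothetical data for this sketch).
    """
    return [prey for prey in library
            if reporter_expressed(True, True,
                                  frozenset((bait, prey)) in interactions)]


# Hypothetical bait, library and interaction data.
known = {frozenset(("bait030", "preyX"))}
positives = screen_library("bait030", ["preyX", "preyY"], known)
```

Either hybrid alone gives no reporter expression in this model, which
mirrors the statement that neither hybrid protein can activate
transcription by itself.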

The two-hybrid system or related methodology can be
used to screen activation domain libraries for proteins that
interact with a known "bait" gene product. By way of
example, and not by way of limitation, gene products (e.g.,
030 gene products) known to be involved in tumor progression
and tumor progression disorders, such as metastatic
diseases, can be used as the bait gene products. Total
genomic or cDNA sequences are fused to the DNA encoding an
activation domain. This library and a plasmid encoding a
hybrid of the bait gene product fused to the DNA-binding
domain are cotransformed into a yeast reporter strain, and
the resulting transformants are screened for those that
express the reporter gene. For example, and not by way of
limitation, the bait gene can be cloned into a vector such
that it is translationally fused to the DNA encoding the
DNA-binding domain of the GAL4 protein. The colonies are
purified and the (library) plasmids responsible for reporter
gene expression are isolated. The inserts in the plasmids
are sequenced to identify the proteins encoded by the cDNA
or genomic DNA.

A cDNA library of a cell or tissue source which
expresses proteins predicted to interact with the bait gene
product can be made using methods routinely practiced in the
art. According to the particular system described herein,
the library is generated by inserting the cDNA fragments
into a vector such that they are translationally fused to
the activation domain of GAL4. This library can be
co-transformed along with the bait gene-GAL4 fusion plasmid
into a yeast strain which contains a lacZ gene whose
expression is controlled by a promoter which contains a GAL4
activation sequence. A cDNA encoded protein, fused to GAL4
activation domain, that interacts with the bait gene product
will reconstitute an active GAL4 transcription factor and
thereby drive expression of the lacZ gene. Colonies which
express lacZ can be detected by their blue color in the
presence of X-gal. cDNA containing plasmids from such a
blue colony can then be purified and used to produce and
isolate the bait gene product interacting protein using
techniques routinely practiced in the art.

Once a pathway gene has been identified and
isolated, it can be further characterized as, for example,
discussed below, in Section 5.3.

5.3. CHARACTERIZATION OF DIFFERENTIALLY 
EXPRESSED AND PATHWAY GENES 

Differentially expressed genes, such as those
identified via the methods discussed, above, in Section 5.1,
and pathway genes, such as those identified via the methods
discussed, above, in Section 5.2, as well as genes
identified by alternative means, can be further
characterized by utilizing, for example, methods such as
those discussed herein. Such genes will be referred to
herein as "identified genes."

Analyses such as those described herein yield
information regarding the biological function of the
identified genes. An assessment of the biological function
of the differentially expressed genes, in addition, will
allow for their designation as target and/or fingerprint
genes.






Specifically, any of the differentially expressed
genes whose further characterization indicates that a
modulation of the gene's expression or a modulation of the
gene product's activity can inhibit tumor progression will
be designated "target genes," as defined, above, in Section
5.1. Such target genes and target gene products, along with
those discussed below, will constitute the focus of the
compound discovery strategies discussed, below, in Section
5.8. Further, such target genes, target gene products
and/or modulating compounds can be used as part of the tumor
progression disorder treatment methods described, below, in
Section 5.9.

Any of the differentially expressed genes whose
further characterization indicates that such modulation
does not positively affect tumor progression, but whose
expression pattern contributes to a gene expression
"fingerprint" pattern correlative of, for example, tumor
progression will be designated a "fingerprint gene."
"Fingerprint patterns" will be more fully discussed, below,
in Section 5.11.1. It should be noted that each of the
target genes can also function as fingerprint genes, as can
all or a portion of the pathway genes.

It should further be noted that the pathway genes
can also be characterized according to techniques such as
those described herein. Those pathway genes which yield
information indicating that they are differentially
expressed and that modulation of the gene's expression or a
modulation of the gene product's activity can inhibit tumor
progression or ameliorate tumor progression-associated
symptoms will also be designated "target genes." Such
target genes and target gene products, along with those
discussed above, will constitute the focus of the compound
discovery strategies discussed, below, in Section 5.8 and
can be used as part of the treatment methods described in
Section 5.9, below.

It should be additionally noted that the
characterization of one or more of the pathway genes can
reveal a lack of differential expression, but evidence that
modulation of the gene's activity or expression can,
nonetheless, ameliorate symptoms of tumor progression. In
such cases, these genes and gene products would also be
considered a focus of the compound discovery strategies of
Section 5.8, below, and can be used as part of the treatment
methods described in Section 5.9, below.

In instances wherein a pathway gene's
characterization indicates that modulation of gene
expression or gene product activity cannot retard the tumor
progression diseases of interest, but the gene is
differentially expressed and contributes to a gene
expression fingerprint pattern correlative of tumor
progression states or disorders, such as metastatic
diseases, such pathway genes can additionally be designated
as fingerprint genes.
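
The designation rules set out in the preceding paragraphs can
be summarized as a small decision procedure. The following
Python sketch is illustrative only: the field names and the
function designate() are hypothetical and do not appear in the
specification, and the three yes/no inputs stand in for the
outcome of the characterization work described herein.

```python
from dataclasses import dataclass


@dataclass
class GeneCharacterization:
    """Outcome of characterizing one gene (hypothetical field names)."""
    differentially_expressed: bool
    modulation_inhibits_progression: bool
    contributes_to_fingerprint: bool


def designate(gene: GeneCharacterization) -> set:
    """Return the designations suggested by the rules in the text."""
    labels = set()
    if gene.modulation_inhibits_progression:
        if gene.differentially_expressed:
            # Modulation helps and expression differs: a target gene.
            labels.add("target")
        else:
            # Pathway gene lacking differential expression: still a focus
            # of compound discovery (Section 5.8) and treatment (5.9).
            labels.add("compound discovery focus")
    if gene.differentially_expressed and gene.contributes_to_fingerprint:
        # Target genes can also function as fingerprint genes.
        labels.add("fingerprint")
    return labels
```

Under these assumptions, a gene that is differentially expressed
and contributes to a fingerprint pattern, but whose modulation
does not affect tumor progression, receives only the
"fingerprint" designation.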

A variety of techniques can be utilized to further
characterize the identified genes. First, the nucleotide
sequence of the identified genes, which can be obtained by
utilizing standard techniques well known to those of skill
in the art, can be used to further characterize such genes.
For example, the sequence of the identified genes can reveal
homologies to one or more known sequence motifs which can
yield information regarding the biological function of the
identified gene product.

Second, an analysis of the tissue and/or cell type
distribution of the mRNA produced by the identified genes
can be conducted, utilizing standard techniques well known
to those of skill in the art. Such techniques can include,
for example, Northern analyses, RT-coupled PCR and RNase
protection techniques. Such analyses provide information as
to whether the identified genes are expressed in tissues
expected to contribute to tumor progression. Such analyses
can also provide quantitative information regarding steady
state mRNA regulation, yielding data concerning which of the
identified genes exhibits a high level of regulation in,
preferably, tissues which can be expected to contribute to
tumor progression. Additionally, standard in situ
hybridization techniques can be utilized to provide
information regarding which cells within a given tissue
express the identified gene. Such an analysis can provide
information regarding the biological function of an
identified gene relative to a given tumor progression
disorder in instances wherein only a subset of the cells
within the tissue is thought to be relevant to the disorder.

Third, the sequences of the identified genes can be
used, utilizing standard techniques, to place the genes onto
genetic maps, e.g., mouse (Copeland, N.G. and Jenkins, N.A.,
1991, Trends in Genetics 7:113-118) and human genetic maps
(Cohen, D., et al., 1993, Nature 366:698-701). Such mapping
information can yield information regarding the genes'
importance to human disease by, for example, identifying
genes which map within genetic regions to which known
genetic tumor progression disorders map.

Fourth, the biological function of the identified
genes can be more directly assessed by utilizing relevant in
vivo and in vitro systems. In vivo systems can include, but
are not limited to, animal systems which naturally exhibit
symptoms of tumor progression, such as metastatic disease,
or ones which have been engineered to exhibit such symptoms.
For example, tumor progression animal models may be
generated by injecting animals, such as mice, with tumor
cells, some of which will give rise to tumors within the
injected animals. Among the cells which may be utilized for
such a purpose are cells listed, above, in Section 5.1.1.1,
such as the B16 cell variants.

The role of identified gene products (e.g., 030 gene
products) can be determined by transfecting cDNAs encoding
these gene products into appropriate cell lines, such as,
for example, a B16 cell line variant, and analyzing the
effect on tumor progression characteristics. For example,
the role/function of genes important in the progression of
human colorectal cancers is assessed using the KM12c (low
metastatic potential) and KM12L4 (highly metastatic) cells
implanted into nude mice spleens, and the number of hepatic
tumors that develop is determined. The function of genes
isolated using human colorectal tumors and their hepatic
metastases is assessed by expressing the gene in the
appropriate KM12 variant. Additionally, the role/function
of genes important in the progression of prostatic and
breast cancers is assessed using appropriate cell lines
described above in Section 5.1.1.1. Importantly, the
role/function of genes important in the progression of
melanoma, colon, prostate and breast cancers in humans is
assessed using biopsy specimens from patients having
undergone surgical treatment, as described in Section
5.1.1.1, above.

Further, such systems can include, but are not
limited to, transgenic animal systems such as those
described, below, in Section 5.7.1. In vitro systems
can include, but are not limited to, cell-based systems
comprising cell types known or suspected of contributing to
tumor progression. Such cells can be wild type cells, or
can be non-wild type cells containing modifications known
to, or suspected of, contributing to tumor progression. Such
systems are discussed in detail, below, in Section 5.7.2.



The procedure to identify and isolate the human homologue of 
the fomy030 gene is described, below, in Section 5.7.3. 

In further characterizing the biological function of
the identified genes, the expression of these genes can be
modulated within the in vivo and/or in vitro systems, i.e.,
either over- or under-expressed, and the subsequent effect
on the system then assayed. Alternatively, the activity of
the product of the identified gene can be modulated by
either increasing or decreasing the level of activity in the
in vivo and/or in vitro system of interest, and its
subsequent effect then assayed.

The information obtained through such
characterizations can suggest relevant methods for the
treatment of tumor progression and tumor progression
disorders involving the gene of interest. Further, relevant
methods for controlling the spread of tumor cells involving
the gene of interest can be suggested by information
obtained from such characterization. For example, treatment
can include a modulation of gene expression and/or gene
product activity. Characterization procedures such as those
described herein can indicate whether such modulation should
involve an increase or a decrease in the expression or
activity of the gene or gene product of interest. Such
methods of treatment are discussed, below, in Section 5.9.

5.4. DIFFERENTIALLY EXPRESSED
AND PATHWAY GENES

Differentially expressed genes, such as those
identified in Section 5.1.1, above, and pathway genes, such
as those identified in Section 5.2, above, are described
herein.

The differentially expressed and pathway genes of
the invention are listed below, in Table 1. The nucleotide
sequence for the differentially expressed fomy030 gene is
shown in FIGS. 2 and 3A and 3B. Specifically, FIG. 2
depicts the nucleotide sequence (SEQ ID NO:1) of the
amplified cDNA band initially identified via differential
display analysis, which is referred to herein as romy030.
FIGS. 3A and 3B depict the nucleotide sequence (SEQ ID NO:2)
of a fomy030 cDNA clone which was isolated using a romy030
probe. The deduced amino acid sequence also is shown in
FIGS. 3A and 3B (SEQ ID NO:3). FIG. 5 shows the nucleotide
(SEQ ID NO:6) and deduced amino acid sequences (SEQ ID NO:7)
of a fohy030 cDNA clone which was isolated using the entire
mouse fomy030 cDNA as a probe. FIG. 6 shows an alternative
splice form of fohy030 (SEQ ID NOs:8 and 9).

Table 1 summarizes information regarding the further
characterization of the differentially expressed fomy030
gene of the invention. Table 2 lists E. coli clones,
deposited with the Agricultural Research Service Culture
Collection (NRRL), which contain sequences found within the
genes of Table 1.

In Table 1, the paradigm used initially to detect
the differentially expressed gene is described under the
column headed "Paradigm of Original Detection." In this
column, "↑" indicates that gene expression is higher (i.e.,
there is a greater steady state amount of detectable mRNA
produced by a given gene) in the indicated cell type
relative to the other cell type, while "↓" indicates that
gene expression is lower (i.e., there is a lower steady
state amount of detectable mRNA produced by a given gene)
in the indicated cell type relative to the other cell type.
As indicated under this column, the 030 gene was initially
identified via a differential screen between B16 F1 (low
metastatic potential cells) and B16 F10 (high metastatic
potential cells) in which 030 gene expression is lower in
the high metastatic potential B16 F10 cell line than in the
low metastatic potential B16 F1 cell line.

The Table 1 column headed "Paradigm Expression
Pattern" lists the cell type in which gene expression was
initially detected. In the case of the 030 gene, gene
expression was first detected in melanoma (i.e., B16) cells.
"Detectable," as used herein, refers to levels of mRNA which
are detectable via standard differential display, Northern,
RT-coupled PCR and/or RNase protection techniques which are
well known to those of skill in the art.

Cell types in which differential expression was
detected are summarized in Table 1 under the column headed
"Cell Type Detected in." In the case of the 030 gene,
expression has additionally been detected within melanocyte
cells.

Additionally, in instances wherein the genes contain
nucleotide sequences similar or homologous to sequences
found in nucleic acid databases, references to such
similarities are listed. Because the 030 gene is a novel
gene, i.e., no homologous gene sequences are present in the
published databases, no such reference is listed.

Finally, nucleotide sequences contained within the
differentially expressed genes are listed in the Figures
indicated under the heading "Seq." In the case of the
fomy030 gene, such sequences are listed in FIGS. 2 and 3A
and 3B, and for fohy030, in FIGS. 5 and 6.

The genes listed in Table 1 can be obtained using
cloning methods well known to those skilled in the art,
including, but not limited to, the use of appropriate probes
to detect the genes within an appropriate cDNA or gDNA
(genomic DNA) library. (See, for example, Sambrook et al.,
1989, Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratories, which is incorporated by reference
herein in its entirety). Probes for the novel sequences
reported herein can be obtained directly from the isolated
clones deposited with the NRRL, as indicated in Table 2,
below. Alternatively, oligonucleotide probes for the novel
genes can be synthesized, using techniques well known to
those of skill in the art, based on the DNA sequences
disclosed herein in FIGS. 2, 3A, 3B, 5, and 6.

The probes can be used to screen cDNA libraries
prepared from an appropriate cell or cell line in which the
gene is transcribed. For example, the genes described
herein that were detected in melanocyte cells can be cloned
from a cDNA library prepared from melanocyte cells such as,
for example, melan-c (Hodgkinson, C.A., et al., 1993, Cell
74:395-404), the cDNA libraries developed from the human
melanoma cell line A2058 (Clontech, Palo Alto, CA) and cDNA
libraries developed from the murine melanoma cell line K1735
(Stratagene, La Jolla, CA). Genomic DNA libraries can be
prepared from any source.

TABLE 1 

Differentially Expressed and Pathway Genes

Table 2, below, lists an E. coli strain as deposited
with the NRRL, which contains an isolated plasmid fomy030
clone. The clone contains a fomy030 cDNA in a pBlueScript
SK- (Stratagene, La Jolla, CA) vector which was isolated
from a mouse melanocyte cDNA library screened with a romy030
probe, as described in Section 6.2, below.



TABLE 2 



GENE        STRAIN DEPOSITED    PLASMID CLONE CONTAINED
            WITH NRRL           WITHIN DEPOSITED STRAIN

fomy030     FOMY030             pFOMY030
fohy030



As used herein, "differentially expressed gene"
(i.e., target and fingerprint genes) or "pathway gene"
refers to (a) a gene containing: at least one of the DNA
sequences disclosed herein (as shown in FIGS. 2, 3A, 3B, 5,
and 6) or contained in the clones listed in Table 2, as
deposited with the NRRL; (b) any DNA sequence that encodes
the amino acid sequence encoded by: the DNA sequences
disclosed herein (as shown in FIGS. 2, 3A, 3B, 5, and 6),
contained in the clones listed in Table 2, as deposited
with the NRRL, or contained within the coding region of the
gene to which the DNA sequences disclosed herein (as shown
in FIGS. 2, 3A, 3B, 5, and 6) or contained in the clones
listed in Table 2, as deposited with the NRRL, belong; (c)
any DNA sequence that hybridizes to the complement of: the
coding sequences disclosed herein (as shown in FIGS. 2, 3A,
3B, 5, and 6), contained in clones listed in Table 2, as
deposited with the NRRL, or contained within the coding
region of the gene to which the DNA sequences disclosed
herein (as shown in FIGS. 2, 3A, 3B, 5, and 6) or contained
in the clones listed in Table 2, as deposited with the NRRL,
belong, under highly stringent conditions, e.g.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium
dodecyl sulfate (SDS), 1 mM EDTA at 65°C, and washing in
0.1xSSC/0.1% SDS at 68°C (Ausubel F.M. et al., eds., 1989,
Current Protocols in Molecular Biology, Vol. I, Green
Publishing Associates, Inc., and John Wiley & Sons, Inc.,
New York, at p. 2.10.3) and encodes a gene product
functionally equivalent to a gene product encoded by a gene
of (a), above; and/or (d) any DNA sequence that hybridizes to
the complement of: the coding sequences disclosed herein
(as shown in FIGS. 2, 3A, 3B, 5, and 6), contained in the
clones listed in Table 2, as deposited with the NRRL, or
contained within the coding region of the gene to which DNA
sequences disclosed herein (as shown in FIGS. 2, 3A, 3B, 5,
and 6) or contained in the clones listed in Table 2, as
deposited with the NRRL, belong, under less stringent
conditions, such as moderately stringent conditions, e.g.,
washing in 0.2xSSC/0.1% SDS at 42°C (Ausubel et al., 1989,
supra), yet which still encodes a gene product functionally
equivalent to a gene product encoded by a gene of (a),
above.

The invention also includes nucleic acid molecules,
preferably DNA molecules, that hybridize to, and are
therefore the complements of, the DNA sequences (a) through
(d), in the preceding paragraph. Such hybridization
conditions can be highly stringent or less highly stringent,
as described above. In instances wherein the nucleic acid
molecules are deoxyoligonucleotides ("oligos"), highly
stringent conditions can refer, e.g., to washing in
6xSSC/0.05% sodium pyrophosphate at 37°C (for 14-base
oligos), 48°C (for 17-base oligos), 55°C (for 20-base
oligos), and 60°C (for 23-base oligos). These nucleic acid
molecules can act as target gene antisense molecules,
useful, for example, in target gene regulation and/or as
antisense primers in amplification reactions of target,
fingerprint, and/or pathway gene nucleic acid sequences.
Further, such sequences can be used as part of ribozyme
and/or triple helix sequences, also useful for target gene
regulation. Still further, such molecules can be used as
components of diagnostic methods whereby tumor progression
disorders can be detected.
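
The oligo wash temperatures recited above can be captured in
a simple lookup. The Python sketch below is illustrative only;
the names OLIGO_WASH_TEMP_C and wash_temperature() are
hypothetical, and the function deliberately declines to
interpolate, since only four oligo lengths are given in the
text.

```python
# Wash temperatures (degrees C) in 6xSSC/0.05% sodium pyrophosphate,
# keyed by oligo length in bases, exactly as recited in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature(oligo_length: int) -> int:
    """Return the recited highly stringent wash temperature.

    Raises ValueError for lengths the text does not list, rather
    than guessing an intermediate value.
    """
    if oligo_length not in OLIGO_WASH_TEMP_C:
        raise ValueError(
            f"no wash temperature recited for {oligo_length}-base oligos"
        )
    return OLIGO_WASH_TEMP_C[oligo_length]
```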

The invention also encompasses (a) DNA vectors that
contain any of the foregoing coding sequences and/or their
complements (i.e., antisense); (b) DNA expression vectors
that contain any of the foregoing coding sequences
operatively associated with a regulatory element that
directs the expression of the coding sequences; and (c)
genetically engineered host cells that contain any of the
foregoing coding sequences operatively associated with a
regulatory element that directs the expression of the coding
sequences in the host cell. As used herein, regulatory
elements include but are not limited to inducible and
non-inducible promoters, enhancers, operators and other
elements known to those skilled in the art that drive and
regulate expression. The invention includes fragments of any
of the DNA sequences disclosed herein.

In addition to the gene sequences described above,
homologues of these gene sequences, as can, for example, be
present in other species (preferably human, in instances
wherein the above described gene sequences are not human
gene sequences), can be identified and can readily be
isolated, without undue experimentation, by molecular
biological techniques well known in the art. Further, there
can exist genes at other genetic loci within the genome that
encode proteins which have extensive homology to one or more
domains of such gene products. These genes can also be
identified via similar techniques.

For example, the isolated differentially expressed
gene sequence can be labeled and used to screen a cDNA
library constructed from mRNA obtained from the organism of
interest. Hybridization conditions will be of a lower
stringency when the cDNA library was derived from an
organism different from the type of organism from which the
labeled sequence was derived. Alternatively, the labeled
fragment can be used to screen a genomic library derived
from the organism of interest, again, using appropriately
stringent conditions. Such low stringency conditions will
be well known to those of skill in the art, and will vary
predictably depending on the specific organisms from which
the library and the labeled sequences are derived. For
guidance regarding such conditions see, for example,
Sambrook et al., 1989, Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al.,
1989, Current Protocols in Molecular Biology (Green
Publishing Associates and Wiley Interscience, N.Y.).

Further, a previously unknown differentially
expressed or pathway gene-type sequence can be isolated by
performing PCR using two degenerate oligonucleotide primer
pools designed on the basis of amino acid sequences within
the gene of interest. The template for the reaction can be
cDNA obtained by reverse transcription of mRNA prepared from
human or non-human cell lines or tissue known or suspected
to express a differentially expressed or pathway gene
allele. The PCR product can be subcloned and sequenced to
ensure that the amplified sequences represent the sequences
of a differentially expressed or pathway gene-like nucleic
acid sequence.
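
By way of illustration, a degenerate primer pool of the kind
referred to above can be enumerated from a short amino acid
sequence by expanding every codon choice under the standard
genetic code. The Python sketch below is a minimal example
under stated assumptions: the function names degeneracy() and
primer_pool() are hypothetical, only the standard genetic code
is considered, and the output is the sense-strand pool (the
opposing pool would be the reverse complement of a pool built
from a downstream peptide).

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the codons that encode it.
CODONS = {}
for (i, b1), (j, b2), (k, b3) in product(enumerate(BASES), repeat=3):
    CODONS.setdefault(AMINO[16 * i + 4 * j + k], []).append(b1 + b2 + b3)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    count = 1
    for residue in peptide:
        count *= len(CODONS[residue])
    return count


def primer_pool(peptide: str) -> list:
    """Enumerate every sense-strand sequence encoding the peptide."""
    choices = (CODONS[residue] for residue in peptide)
    return ["".join(codons) for codons in product(*choices)]
```

Peptide stretches rich in methionine and tryptophan (one codon
each) give small pools, whereas leucine, serine and arginine
(six codons each) enlarge the pool quickly, which is one reason
the choice of amino acid stretch matters in practice.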



The PCR fragment can then be used to isolate a full
length cDNA clone by a variety of methods. For example,
the amplified fragment can be labeled and used to screen a
bacteriophage cDNA library. Alternatively, the labeled
fragment can be used to screen a genomic library.

PCR technology can also be utilized to isolate full
length cDNA sequences. For example, RNA can be isolated,
following standard procedures, from an appropriate cellular
or tissue source. A reverse transcription reaction can be
performed on the RNA using an oligonucleotide primer
specific for the most 5' end of the amplified fragment for
the priming of first strand synthesis. The resulting
RNA/DNA hybrid can then be "tailed" with guanines using a
standard terminal transferase reaction, the hybrid can be
digested with RNase H, and second strand synthesis can then
be primed with a poly-C primer. Thus, cDNA sequences
upstream of the amplified fragment can easily be isolated.
For a review of cloning strategies which can be used, see,
e.g., Sambrook et al., 1989, Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al.,
1989, Current Protocols in Molecular Biology (Green
Publishing Associates and Wiley Interscience, N.Y.).

In cases where the differentially expressed or
pathway gene identified is the normal, or wild type, gene,
this gene can be used to isolate mutant alleles of the gene.
Such an isolation is preferable in processes and disorders
which are known or suspected to have a genetic basis.
Mutant alleles can be isolated from individuals either known
or suspected to have a genotype which contributes to tumor
progression symptoms. Mutant alleles and mutant allele
products can then be utilized in the therapeutic and
diagnostic assay systems described below.

A cDNA of a mutant gene can be isolated, for
example, by using PCR, a technique which is well-known to
one skilled in the art. In this case, the first cDNA strand
can be synthesized by hybridizing an oligo-dT oligonucleotide
to mRNA isolated from tissue known or suspected to express
the gene in an individual putatively carrying the mutant
allele, and by extending the new strand with reverse
transcriptase. The second strand of the cDNA can then be
synthesized using an oligonucleotide that hybridizes
specifically to the 5' end of the normal gene. Using these
two primers, the product is then amplified via PCR, cloned
into a suitable vector, and subjected to DNA sequence
analysis through methods well-known to one skilled in the
art. By comparing the DNA sequence of the mutant gene to
that of the normal gene, the mutation(s) responsible for the
loss or alteration of function of the mutant gene product
can be ascertained.
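
The comparison of mutant and normal sequences described above
can be sketched, for the simplest case, as a position-by-
position scan. The Python example below is illustrative only:
the function name find_substitutions() is hypothetical, and it
assumes the two sequences are already aligned and of equal
length, so that only base substitutions are reported.
Insertions and deletions would require a proper sequence
alignment, which is outside the scope of this sketch.

```python
def find_substitutions(normal: str, mutant: str) -> list:
    """List (position, normal_base, mutant_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned
    and of equal length; otherwise a ValueError is raised.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, n, m)
        for position, (n, m) in enumerate(zip(normal, mutant), start=1)
        if n != m
    ]
```

For instance, comparing the hypothetical fragments "ATGGCC"
(normal) and "ATGACC" (mutant) reports a single G-to-A
substitution at position 4.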

Alternatively, a genomic or cDNA library can be 
constructed and screened using DNA or RNA, respectively, 
from a tissue known to or suspected of expressing the gene 
of interest in an individual suspected of or known to carry 
the mutant allele. The normal gene or any suitable fragment 
thereof can then be labeled and used as a probe to identify 
the corresponding mutant allele in the library. The clone 
containing this gene can then be purified through methods 
routinely practiced in the art, and subjected to sequence 
analysis as described, above, in this Section. 

Additionally, an expression library can be
constructed utilizing DNA isolated from or cDNA synthesized
from a tissue known to or suspected of expressing the gene
of interest in an individual suspected of or known to carry
the mutant allele. In this manner, gene products made by
the putatively mutant tissue can be expressed and screened
using standard antibody screening techniques in conjunction
with antibodies raised against the normal gene product, as
described, below, in Section 5.2.3. (For screening
techniques, see, for example, Harlow, E. and Lane, eds.,
1988, "Antibodies: A Laboratory Manual," Cold Spring Harbor
Press, Cold Spring Harbor.) In cases where the mutation
results in an expressed gene product with altered function
(e.g., as a result of a missense mutation), a polyclonal set
of antibodies is likely to cross-react with the mutant gene
product. Library clones detected via their reaction with
such labeled antibodies can be purified and subjected to
sequence analysis as described in this Section, above.

Taking the fomy030 gene as an example, the fomy030
human homolog can be isolated by a variety of methods.
First, sequences found in a murine fomy030 cDNA can be
utilized as hybridization probes to detect human fohy030
sequences. This can be accomplished, for example, by
probing Southern blots containing total human genomic DNA
with a labelled fomy030 probe. Once it is verified that the
probe being utilized detects the human 030 gene, one of
skill in the art can employ any of several routine
approaches to isolate the human gene without undue
experimentation.

In one approach, appropriate human cDNA libraries
can be screened. Such cDNA libraries can, for example,
include human melanocyte, human retina and fetal human brain
cDNA libraries. For example, panels of human melanoma cells
(such as, for example, SK-MEL-2, ATCC 68-HTB; SK-MEL-5, ATCC
70-HTB; SK-MEL-28, ATCC 72-HTB; G-361, ATCC 1424-CRL; and/or
HT-144 [63-HTB] cells) can be screened for 030 expression
by, for example, Northern blot analysis. Upon detection of
030 transcript, cDNA libraries can be constructed from RNA
isolated from the appropriate cell line, utilizing standard
techniques well known to those of skill in the art. The
human cDNA library can then be screened with a 030 probe in
order to isolate a human romy030 cDNA. As described below,
this method was used to determine the human fohy030 cDNAs in
FIGS. 5 and 6.

Alternatively, a human total genomic DNA library can
be screened using 030 probes. 030-positive clones can then
be sequenced and, further, the intron/exon structure of the
human 030 gene may be elucidated. Once genomic sequence is
obtained, oligonucleotide primers can be designed based on
the sequence for use in the isolation, via, for example,
RT-coupled PCR, of human 030 cDNA.

The procedures described in these approaches are
routine and have been described in detail in Sections
5.1.1.2, 5.3 and 5.7.2.

5.5. DIFFERENTIALLY EXPRESSED 
AND PATHWAY GENE PRODUCTS 

Differentially expressed and pathway gene products
include those proteins encoded by the differentially
expressed and pathway gene sequences described in Section
5.2.1, above, as for example, the peptide listed in FIG. 3.
Specifically, differentially expressed and pathway gene
products can include differentially expressed and pathway
gene polypeptides encoded by the differentially expressed
and pathway gene sequences contained in the clones listed in
Table 2, above, as deposited with the NRRL, or contained in
the coding regions of the genes to which DNA sequences
disclosed herein (in FIGS. 3A, 3B, 5, and 6) or contained in
the clones listed in Table 2, as deposited with the NRRL,
belong, for example.

In addition, differentially expressed and pathway
gene products can include proteins that represent
functionally equivalent gene products. Such an equivalent
differentially expressed or pathway gene product can contain
deletions, additions or substitutions of amino acid residues
within the amino acid sequence encoded by the differentially
expressed or pathway gene sequences described, above, in
Section 5.2.1, but which result in a silent change thus
producing a functionally equivalent differentially expressed
or pathway gene product. Amino acid substitutions can be
made on the basis of similarity in polarity, charge,
solubility, hydrophobicity, hydrophilicity, and/or the
amphipathic nature of the residues involved. For example,
nonpolar (hydrophobic) amino acids include alanine, leucine,
isoleucine, valine, proline, phenylalanine, tryptophan, and
methionine; polar neutral amino acids include glycine,
serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include
arginine, lysine, and histidine; and negatively charged
(acidic) amino acids include aspartic acid and glutamic
acid. "Functionally equivalent," as utilized herein, refers
to a protein capable of exhibiting a substantially
similar in vivo activity as the endogenous differentially
expressed or pathway gene products encoded by the
differentially expressed or pathway gene sequences described
in Section 5.2.1, above. Alternatively, when utilized as
part of assays such as those described, below, in Section
5.3, "functionally equivalent" can refer to peptides capable
of interacting with other cellular or extracellular
molecules in a manner substantially similar to the way in
which the corresponding portion of the endogenous
differentially expressed or pathway gene product would.
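
The amino acid groupings listed above lend themselves to a
simple check of whether a proposed substitution stays within
one group. The Python sketch below is illustrative only: the
names RESIDUE_CLASSES and is_conservative() are hypothetical,
the groups are transcribed from the preceding paragraph using
one-letter codes, and membership in the same group is merely
one indicator of a possibly silent change, not proof of
functional equivalence.

```python
# Groupings as listed in the text, using one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar neutral": set("GSTCYNQ"),
    "positively charged": set("RKH"),
    "negatively charged": set("DE"),
}


def residue_class(residue: str) -> str:
    """Return the name of the group containing the residue."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unrecognized amino acid code: {residue!r}")


def is_conservative(original: str, substitute: str) -> bool:
    """True when both residues fall within the same listed group."""
    return residue_class(original) == residue_class(substitute)
```

Under this scheme, replacing leucine with isoleucine stays
within the nonpolar group, whereas replacing aspartic acid
with lysine crosses from the acidic to the basic group.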

The differentially expressed or pathway gene 
products can be produced by synthetic techniques or via 
recombinant DNA technology using techniques well known in 



the art. Methods for preparing the differentially expressed 
or pathway gene polypeptides and peptides of the invention 
by expressing nucleic acid encoding differentially expressed 
or pathway gene sequences are described herein. Methods 
which are well known to those skilled in the art can be used 
to construct expression vectors containing differentially 
expressed or pathway gene protein coding sequences and 
appropriate transcriptional/translational control signals. 
These methods include, for example, in vitro recombinant DNA 
techniques, synthetic techniques and in vivo 
recombination/genetic recombination. See, for example, the
techniques described in Maniatis et al., 1989, Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory,
N.Y., which is incorporated by reference herein in its
entirety, and Ausubel, 1989, supra. Alternatively, RNA
capable of encoding differentially expressed or pathway gene 
protein sequences can be chemically synthesized using, for 
example, synthesizers. See, for example, the techniques 
described in "Oligonucleotide Synthesis," 1984, Gait, M.J. 
ed., IRL Press, Oxford, which is incorporated by reference 
herein in its entirety. 

A variety of host-expression vector systems can be 
utilized to express the differentially expressed or pathway 
gene coding sequences of the invention. Such host- 
expression systems represent vehicles by which the coding 
sequences of interest can be produced and subsequently 
purified, but also represent cells which can, when 
transformed or transfected with the appropriate nucleotide 
coding sequences, exhibit the differentially expressed or 
pathway gene protein of the invention in situ. These 
include but are not limited to microorganisms such as
bacteria (e.g., E. coli, B. subtilis) transformed with
recombinant bacteriophage DNA, plasmid DNA or cosmid DNA
expression vectors containing differentially expressed or
pathway gene protein coding sequences; yeast (e.g.,
Saccharomyces, Pichia) transformed with recombinant yeast
expression vectors containing the differentially expressed
or pathway gene protein coding sequences; insect cell
systems infected with recombinant virus expression vectors
(e.g., baculovirus) containing the differentially expressed
or pathway gene protein coding sequences; plant cell systems
infected with recombinant virus expression vectors (e.g.,
cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV)
or transformed with recombinant plasmid expression vectors
(e.g., Ti plasmid) containing differentially expressed or
pathway gene protein coding sequences; or mammalian cell
systems (e.g., COS, CHO, BHK, 293, 3T3) harboring
recombinant expression constructs containing promoters
derived from the genome of mammalian cells (e.g.,
metallothionein promoter) or from mammalian viruses (e.g.,
the adenovirus late promoter; the vaccinia virus 7.5K
promoter).

In bacterial systems, a number of expression vectors 
can be advantageously selected depending upon the use 
intended for the differentially expressed or pathway gene 
protein being expressed. For example, when a large quantity 
of such a protein is to be produced, for the generation of 
antibodies or to screen peptide libraries, for example, 
vectors which direct the expression of high levels of fusion 
protein products that are readily purified can be desirable. 
Such vectors include, but are not limited to, the E. coli
expression vector pUR278 (Ruther et al., 1983, EMBO J.
2:1791), in which the differentially expressed or pathway 
gene protein coding sequence can be ligated individually 
into the vector in frame with the lacZ coding region so that 
a fusion protein is produced; pIN vectors (Inouye & Inouye,
1985, Nucleic Acids Res. 13:3101-3109; Van Heeke & Schuster,
1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX
vectors can also be used to express foreign polypeptides as
fusion proteins with glutathione S-transferase (GST). In
general, such fusion proteins are soluble and can easily be
purified from lysed cells by adsorption to glutathione-
agarose beads followed by elution in the presence of free
glutathione. The pGEX vectors are designed to include
thrombin or factor Xa protease cleavage sites so that the
cloned target gene protein can be released from the GST
moiety.

In an insect system, Autographa californica nuclear 
polyhedrosis virus (AcNPV) is used as a vector to express 
foreign genes. The virus grows in Spodoptera frugiperda
cells. The differentially expressed or pathway gene coding
sequence can be cloned individually into non-essential
regions (for example, the polyhedrin gene) of the virus and
placed under control of an AcNPV promoter (for example, the
polyhedrin promoter). Successful insertion of a
differentially expressed or pathway gene coding sequence
will result in inactivation of the polyhedrin gene and
production of non-occluded recombinant virus (i.e., virus
lacking the proteinaceous coat coded for by the polyhedrin
gene). These recombinant viruses are then used to infect
Spodoptera frugiperda cells in which the inserted gene is
expressed (e.g., see Smith et al., 1983, J. Virol. 46:584;
Smith, U.S. Patent No. 4,215,051).

In mammalian host cells, a number of viral-based 
expression systems can be utilized. In cases where an
adenovirus is used as an expression vector, the
differentially expressed or pathway gene coding sequence of
interest can be ligated to an adenovirus transcription/
translation control complex, e.g., the late promoter and
tripartite leader sequence. This chimeric gene can then be
inserted in the adenovirus genome by in vitro or in vivo
recombination. Insertion in a non-essential region of the
viral genome (e.g., region E1 or E3) will result in a
recombinant virus that is viable and capable of expressing
differentially expressed or pathway gene protein in infected
hosts (e.g., see Logan & Shenk, 1984, Proc. Natl. Acad. Sci.
USA 81:3655-3659). Specific initiation signals can also be
required for efficient translation of inserted
differentially expressed or pathway gene coding sequences.
These signals include the ATG initiation codon and adjacent
sequences. In cases where an entire identified gene,
including its own initiation codon and adjacent sequences,
is inserted into the appropriate expression vector, no
additional translational control signals can be needed.
However, in cases where only a portion of the identified
coding sequence is inserted, exogenous translational control
signals, including, perhaps, the ATG initiation codon, must
be provided. Furthermore, the initiation codon must be in
phase with the reading frame of the desired coding sequence
to ensure translation of the entire insert. These exogenous
translational control signals and initiation codons can be
of a variety of origins, both natural and synthetic. The
efficiency of expression can be enhanced by the inclusion of
appropriate transcription enhancer elements, transcription
terminators, etc. (see Bittner et al., 1987, Methods in
Enzymol. 153:516-544).
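
The requirement that the initiation codon be in phase with the reading frame of the insert can be illustrated with a short check. The following Python sketch is illustrative only and is not part of the described methods; the function name, the index conventions and the example sequences are hypothetical, and it assumes a simple DNA string in which translation begins at a known ATG.

```python
# Standard stop codons in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_orf(construct: str, atg_index: int,
                 insert_start: int, insert_end: int) -> bool:
    """Return True if the ATG at atg_index is in phase with the insert
    (construct[insert_start:insert_end]) and no stop codon occurs between
    the ATG and the end of the insert.

    All names and index conventions here are hypothetical illustrations.
    """
    seq = construct.upper()
    # An initiation codon must actually be present at the stated position.
    if seq[atg_index:atg_index + 3] != "ATG":
        return False
    # The insert must begin a whole number of codons downstream of the ATG.
    if (insert_start - atg_index) % 3 != 0:
        return False
    # Walk codon by codon from the ATG to the end of the insert.
    for pos in range(atg_index, insert_end - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return False
    return True


# Hypothetical vector leader carrying an exogenous ATG at index 6.
leader = "GCCACCATGGGC"
insert = "AAACCCGGGTTT"
construct = leader + insert
print(in_frame_orf(construct, 6, len(leader), len(construct)))
```

Under these assumptions, a single extra base between the leader and the insert shifts the frame and the check fails, which is the situation the exogenous signals must be designed to avoid.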

In addition, a host cell strain can be chosen which
modulates the expression of the inserted sequences, or
modifies and processes the gene product in the specific
fashion desired. Such modifications (e.g., glycosylation)
and processing (e.g., cleavage) of protein products can be
important for the function of the protein. Different host
cells have characteristic and specific mechanisms for the
post-translational processing and modification of proteins.
Appropriate cell lines or host systems can be chosen to
ensure the correct modification and processing of the
foreign protein expressed. To this end, eukaryotic host
cells which possess the cellular machinery for proper
processing of the primary transcript, glycosylation, and
phosphorylation of the gene product can be used. Such
mammalian host cells include but are not limited to CHO,
VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, etc.

For long-term, high-yield production of recombinant
proteins, stable expression is preferred. For example, cell
lines which stably express the differentially expressed or
pathway gene protein can be engineered. Rather than using
expression vectors which contain viral origins of
replication, host cells can be transformed with DNA
controlled by appropriate expression control elements (e.g.,
promoter, enhancer sequences, transcription terminators,
polyadenylation sites, etc.), and a selectable marker.
Following the introduction of the foreign DNA, engineered
cells can be allowed to grow for 1-2 days in an enriched
medium, and then are switched to a selective medium. The
selectable marker in the recombinant plasmid confers
resistance to the selection and allows cells to stably
integrate the plasmid into their chromosomes and grow to
form foci which in turn can be cloned and expanded into cell
lines. This method can advantageously be used to engineer
cell lines which express the identified gene protein. Such
engineered cell lines can be particularly useful in
screening and evaluation of compounds that affect the
endogenous activity of the differentially expressed or
pathway gene protein.

A number of selection systems can be used,
including, but not limited to, the herpes simplex virus
thymidine kinase (Wigler, et al., 1977, Cell 11:223),
hypoxanthine-guanine phosphoribosyltransferase (Szybalska &
Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026), and
adenine phosphoribosyltransferase (Lowy, et al., 1980, Cell
22:817) genes in tk-, hgprt- or aprt- cells, respectively.
Also, antimetabolite resistance can be used as the basis of
selection for dhfr, which confers resistance to methotrexate
(Wigler, et al., 1980, Proc. Natl. Acad. Sci. USA 77:3567;
O'Hare, et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt,
which confers resistance to mycophenolic acid (Mulligan &
Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072); neo, which
confers resistance to the aminoglycoside G-418 (Colberre-
Garapin, et al., 1981, J. Mol. Biol. 150:1); and hygro,
which confers resistance to hygromycin (Santerre, et al.,
1984, Gene 30:147) genes.

An alternative fusion protein system allows for the
ready purification of non-denatured fusion proteins
expressed in human cell lines (Janknecht, et al., 1991,
Proc. Natl. Acad. Sci. USA 88:8972-8976). In this system,
the gene of interest is subcloned into a vaccinia
recombination plasmid such that the gene's open reading
frame is translationally fused to an amino-terminal tag
consisting of six histidine residues. Extracts from cells
infected with recombinant vaccinia virus are loaded onto
Ni2+ nitriloacetic acid-agarose columns and histidine-tagged
proteins are selectively eluted with imidazole-containing
buffers.

When used as a component in assay systems such as
that described herein, the differentially expressed or
pathway gene protein can be labeled, either directly or
indirectly, to facilitate detection of a complex formed
between the differentially expressed or pathway gene protein
and a test substance. Any of a variety of suitable labeling
systems can be used including but not limited to
radioisotopes such as 125I; enzyme labeling systems that
generate a detectable colorimetric signal or light when
exposed to substrate; and fluorescent labels.

Where recombinant DNA technology is used to produce 
the differentially expressed or pathway gene protein for 
such assay systems, it can be advantageous to engineer 
fusion proteins that can facilitate labeling, solubility, 
immobilization and/or detection. 

Indirect labeling involves the use of a third 
protein, such as a labeled antibody, which specifically 
binds to either a differentially expressed or pathway gene 
product. Such antibodies include but are not limited to 
polyclonal, monoclonal, chimeric, single chain, Fab 
fragments and fragments produced by a Fab expression 
library. 

5.6. ANTIBODIES SPECIFIC FOR
DIFFERENTIALLY EXPRESSED
OR PATHWAY GENE PRODUCTS

Described herein are methods for the production of
antibodies capable of specifically recognizing one or more
differentially expressed or pathway gene epitopes. Such
antibodies can include, but are not limited to, polyclonal
antibodies, monoclonal antibodies (mAbs), humanized or
chimeric antibodies, single chain antibodies, Fab fragments,
F(ab')2 fragments, fragments produced by a Fab expression
library, anti-idiotypic (anti-Id) antibodies, and epitope-
binding fragments of any of the above. Such antibodies can
be used, for example, in the detection of a fingerprint,
target, or pathway gene in a biological sample, or,
alternatively, as a method for the inhibition of abnormal
target gene activity. Thus, such antibodies can be utilized
as part of tumor progression treatment methods, and/or can
be used as part of diagnostic techniques whereby patients can
be tested for abnormal levels of fingerprint, target, or
pathway gene proteins, or for the presence of abnormal forms
of such proteins.

For the production of antibodies to a differentially 
expressed or pathway gene, various host animals can be 
immunized by injection with a differentially expressed or 
pathway gene protein, or a portion thereof. Such host 
animals can include but are not limited to rabbits, mice, 
and rats, to name but a few. Various adjuvants can be used 
to increase the immunological response, depending on the 
host species, including but not limited to Freund's 
(complete and incomplete), mineral gels such as aluminum
hydroxide, surface active substances such as lysolecithin, 
pluronic polyols, polyanions, peptides, oil emulsions, 
keyhole limpet hemocyanin, dinitrophenol, and potentially 
useful human adjuvants such as BCG (bacille Calmette-Guerin) 
and Corynebacterium parvum. 

Polyclonal antibodies are heterogeneous populations 
of antibody molecules derived from the sera of animals 
immunized with an antigen, such as target gene product 
(e.g., protein encoded by 030), or an antigenic functional
derivative thereof. For the production of polyclonal
antibodies, host animals such as those described above can
be immunized by injection with differentially expressed or
pathway gene product (e.g., 030) supplemented with adjuvants
as also described above. 

Monoclonal antibodies, which are homogeneous
populations of antibodies to a particular antigen, can be
obtained by any technique which provides for the production
of antibody molecules by continuous cell lines in culture.
These include, but are not limited to, the hybridoma
technique of Kohler and Milstein (1975, Nature 256:495-497;
and U.S. Patent No. 4,376,110), the human B-cell hybridoma
technique (Kosbor et al., 1983, Immunology Today 4:72; Cole
et al., 1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and
the EBV-hybridoma technique (Cole et al., 1985, Monoclonal
Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-
96). Such antibodies can be of any immunoglobulin class
including IgG, IgM, IgE, IgA, IgD and any subclass thereof.
The hybridoma producing the mAb of this invention can be
cultivated in vitro or in vivo. Production of high titers
of mAbs in vivo makes this the presently preferred method of
production.

In addition, techniques developed for the production
of "chimeric antibodies" (Morrison et al., 1984, Proc. Natl.
Acad. Sci., 81:6851-6855; Neuberger et al., 1984, Nature,
312:604-608; Takeda et al., 1985, Nature, 314:452-454; U.S.
Patent No. 4,816,567) by splicing the genes from a mouse
antibody molecule of appropriate antigen specificity
together with genes from a human antibody molecule of
appropriate biological activity can be used. A chimeric
antibody is a molecule in which different portions are
derived from different animal species, such as those having
a variable region derived from a murine mAb and a human
immunoglobulin constant region.

Alternatively, techniques described for the
production of single chain antibodies (U.S. Patent
4,946,778; Bird, 1988, Science 242:423-426; Huston et al.,
1988, Proc. Natl. Acad. Sci. USA 85:5879-5883; and Ward et
al., 1989, Nature 334:544-546) and for making humanized
monoclonal antibodies (U.S. Patent No. 5,225,539, which is
incorporated herein by reference in its entirety) can be
utilized to produce anti-differentially expressed or anti-
pathway gene product antibodies.

Antibody fragments which recognize specific epitopes
can be generated by known techniques. For example, such
fragments include but are not limited to: the F(ab')2
fragments which can be produced by pepsin digestion of the
antibody molecule and the Fab fragments which can be
generated by reducing the disulfide bridges of the F(ab')2
fragments. Alternatively, Fab expression libraries can be
constructed (Huse et al., 1989, Science, 246:1275-1281) to
allow rapid and easy identification of monoclonal Fab
fragments with the desired specificity.

5.7. CELL- AND ANIMAL-BASED MODEL SYSTEMS

Described herein are cell- and animal-based systems
which represent reliable models for tumor progression
disorders. These systems can be used in a variety of
applications. For example, the cell- and animal-based model
systems can be used to identify differentially expressed
genes via the paradigms described, above, in Section
5.1.1.1. Such systems can also be used to further
characterize differentially expressed and pathway genes, as
described, above, in Section 5.3. Such further
characterization can, for example, indicate that a
differentially expressed gene is a target gene.
Additionally, such assays can be utilized as part of
screening strategies designed to identify compounds which
are capable of preventing and/or ameliorating symptoms of
tumor progression disorders, including those associated with
metastatic diseases, as described, below. Thus, the animal-
and cell-based models can be used to identify drugs,
pharmaceuticals, therapies and interventions which can be
effective in treating tumor progression disorders, such as,
for example, metastatic diseases. In addition, as described
in detail, below, in Section 5.10.1, such animal models can
be used to determine the LD50 and the ED50 in animal
subjects, and such data can be used to determine the in vivo
efficacy of potential anti-tumor progression disorder
treatments.
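
By way of illustration only, LD50 and ED50 values are commonly combined into a single ratio, often called the therapeutic index (LD50/ED50). The Python sketch below assumes that convention; the function name and the dose values are hypothetical and are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50 for doses expressed in the same units.

    Assumption: the conventional definition of the therapeutic index is
    used here for illustration; a larger ratio indicates a wider margin
    between the effective dose and the lethal dose.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to show the arithmetic.
print(therapeutic_index(ld50=200.0, ed50=10.0))
```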

5.7.1. ANIMAL-BASED SYSTEMS 
Animal-based model systems of tumor progression
disorders can include both non-recombinant animals and
recombinantly engineered transgenic animals.

Non-recombinant animal models for tumor progression
can include, for example, murine models of melanoma,
prostate cancer and colon cancer. Such models may be
generated, for example, by introducing tumor cells into
syngeneic mice using techniques such as subcutaneous
injection, tail vein injection, spleen implantation,
intraperitoneal implantation, implantation under the renal
capsule or orthotopic implantation (e.g., colon cancer cells
implanted in colonic tissue or prostatic cancer cells
implanted in prostate gland). After an appropriate period
of time, the tumors which result from these injections can
be counted and analyzed.

Among the cells which may be used for the production
of such animal models of tumor progression are cells derived
from the cell lines listed, above, in Section 5.1.1.1. For
example, B16 melanoma cells (Fidler, I.J., 1973, Nature New
Biol. 242:148-149), including cell variants exhibiting high
(e.g., B16 F10 cells) and low (e.g., B16 F1 cells)
metastatic potential may be utilized. Post-injection,
pulmonary tumors generally develop in the mouse models.
Thus, these animals serve not only as models of melanoma
tumor progression but also as models of pulmonary
metastases.

For the generation of animal models of colorectal 
cancers, colon cancer cells such as, for example, KM12c (low
metastatic potential) and KM12L4 (highly metastatic) cells
(Morikawa, K. et al., 1988, Cancer Research 48:1943-1948)
can be implanted into nude mice spleens. In these cases, 
the animals generally develop hepatic tumors. Thus, such 
animals serve as models of not only colorectal tumor 
progression but also as models of hepatic metastases. 

For the generation of animal models of prostate 
cancer tumor progression, cells derived from, for example, 
the high metastatic potential prostatic cell line PC-3-M or 
the non-metastatic cell line DU 145 (Karmali, R.A. et al.,
1987, Anticancer Res. 7:1173-1180; Kozlowski, J.M. et al.,
1984, Cancer Research 44:3522-3529) may be implanted into 
the prostates of animals and the resulting tumors may be 
analyzed and compared to, for example, normal tissue. In 
such a manner, genes which are differentially expressed in 
neoplastic versus normal cells as well as versus metastatic 
cells may be identified. 

The role of identified gene products (e.g., 030 gene
products) can be determined by transfecting cDNAs encoding
such gene products into the appropriate cell line and
analyzing their effect on the cells' ability to induce tumor
progression in animal models such as these. The role of the
identified gene products may be further analyzed by, for
example, culturing cells derived from the tumors which
develop in the animal models, introducing these cultured
cells into animals, and subsequently measuring the level of
identified gene product present in the resulting tumor
cells. In this manner, cell line variants are developed
which can be useful in analyzing the role of quantitative
and/or qualitative differences in the expression of the
identified genes on the cells' ability to induce tumor
progression. For example, as demonstrated, below, in the
Example presented in Section 6, 030 gene expression is
inversely related to the metastatic potential of the tumor
cell line used to generate such a tumor progression animal
model.

Additionally, recombinant animal models exhibiting
tumor progression characteristics and/or symptoms of tumor
progression disorders, including metastatic diseases, can be
utilized, including, for example, such well-known animal
models as the transgenic mouse model for human melanoma and
transgenic mice which carry specific mutations which result
in multiple intestinal tumors (Mintz, M. and Silvers W.K.,
1993, Proc. Natl. Acad. Sci. USA 90:8817-8821; and Fodde, R.,
et al., 1994, Proc. Natl. Acad. Sci. USA 91:8969-8973). Further,
recombinant animal models for tumor progression can be
engineered by utilizing, for example, target gene sequences
such as those described, above, in Section 5.4, in
conjunction with techniques for producing transgenic animals
that are well known to those of skill in the art. For
example, target gene sequences can be introduced into, and
overexpressed in, the genome of the animal of interest, or,
if endogenous target gene sequences are present, they can
either be overexpressed or, alternatively, can be disrupted
in order to underexpress or inactivate target gene
expression.

In order to overexpress a target gene sequence, the
coding portion of the target gene sequence can be ligated to
a regulatory sequence which is capable of driving gene
expression in the animal and cell type of interest. Such
regulatory regions will be well known to those of skill in
the art, and can be utilized in the absence of undue
experimentation.

In order to obtain underexpression of an endogenous
target gene sequence, such a sequence can be introduced into
the genome of the animal of interest such that the
endogenous target gene alleles will be inactivated.
Preferably, an engineered sequence comprising at least part
of the target gene sequence is utilized and is introduced,
via gene targeting, such that the endogenous target sequence
is disrupted upon integration of the engineered target gene
sequence into the animal's genome. Gene targeting is
discussed, below, in this Section.

Animals of any species, including, but not limited
to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs,
goats, and non-human primates, e.g., baboons, monkeys, and
chimpanzees, can be used to generate animal models of tumor
progression and tumor progression disorders, such as, for
example, metastatic diseases.

Any technique known in the art can be used to
introduce a target gene transgene into animals to produce
the founder lines of transgenic animals. Such techniques
include, but are not limited to, pronuclear microinjection
(Hoppe, P.C. and Wagner, T.E., 1989, U.S. Pat. No.
4,873,191); retrovirus mediated gene transfer into germ
lines (Van der Putten et al., 1985, Proc. Natl. Acad. Sci.,
USA 82:6148-6152); gene targeting in embryonic stem cells
(Thompson et al., 1989, Cell 56:313-321); electroporation of
embryos (Lo, 1983, Mol. Cell. Biol. 3:1803-1814); and sperm-
mediated gene transfer (Lavitrano et al., 1989, Cell 57:717-
723); etc. For a review of such techniques, see Gordon,
1989, Transgenic Animals, Intl. Rev. Cytol. 115:171-229,
which is incorporated by reference herein in its entirety.

The present invention provides for transgenic
animals that carry the transgene in all their cells, as well
as animals which carry the transgene in some, but not all
their cells, i.e., mosaic animals. The transgene can be
integrated, either as a single transgene or in concatamers,
e.g., head-to-head tandems or head-to-tail tandems. The
transgene can also be selectively introduced into and
activated in a particular cell type by following, for
example, the teaching of Lasko et al. (Lasko, M., et al.,
1992, Proc. Natl. Acad. Sci. USA 89:6232-6236). The
regulatory sequences required for such a cell-type specific
activation will depend upon the particular cell type of
interest, and will be apparent to those of skill in the art.

When it is desired that the target gene transgene be
integrated into the chromosomal site of the endogenous
target gene, gene targeting is preferred. Briefly, when
such a technique is to be utilized, vectors containing some
nucleotide sequences homologous to the endogenous target
gene of interest are designed for the purpose of
integrating, via homologous recombination with chromosomal
sequences, into and disrupting the function of, the
nucleotide sequence of the endogenous target gene. The
transgene can also be selectively introduced into a
particular cell type, thus inactivating the endogenous gene
of interest in only that cell type, by following, for
example, the teaching of Gu et al. (Gu, H., et al., 1994,
Science 265:103-106). The regulatory sequences required for
such a cell-type specific inactivation will depend upon the
particular cell type of interest, and will be apparent to
those of skill in the art.

Once transgenic animals have been generated, the
expression of the recombinant target gene and protein can be
assayed utilizing standard techniques. Initial screening can
be accomplished by Southern blot analysis or PCR techniques
to analyze animal tissues to assay whether integration of
the transgene has taken place. The level of mRNA expression
of the transgene in the tissues of the transgenic animals
can also be assessed using techniques which include, but are
not limited to, Northern blot analysis of tissue samples
obtained from the animal, in situ hybridization analysis,
and RT-coupled PCR. Samples of target gene-expressing
tissue can also be evaluated immunocytochemically using
antibodies specific for the transgenic product of interest.

The target gene transgenic animals that express
target gene mRNA or target gene transgene peptide (detected
immunocytochemically, using antibodies directed against
target gene product epitopes) at easily detectable levels
should then be further evaluated to identify those animals
which display tumor progression state characteristics,
including tumor progression disorder symptoms. Such tumor
progression disorder characteristics and/or symptoms can
include, for example, those associated with such tumor cells
as found in human melanoma, breast, gastrointestinal, such
as esophageal, stomach, colon, bowel, colorectal and rectal
cancers, prostate, bladder, testicular, ovarian, uterine,
cervical, brain, lung, bronchial, larynx, pharynx, liver,
pancreatic, thyroid, bone, leukemias, lymphomas and various
types of skin cancers.

Additionally, specific cell types within the
transgenic animals can be analyzed for cellular phenotypes
characteristic of tumor progression. Such cellular
phenotypes can include, for example, differential gene
expression characteristic of cells within a given tumor
progression state of interest. Further, such cellular
phenotypes can include an assessment of a particular cell
type fingerprint pattern of expression and its comparison to
known fingerprint expression profiles of the particular cell
type in animals exhibiting tumor progression. Such
transgenic animals serve as suitable model systems for tumor
progression disorders.

Once target gene transgenic founder animals are
produced (i.e., those animals which express target gene
proteins in cells or tissues of interest, and which,
preferably, exhibit tumor progression characteristics), they
can be bred, inbred, outbred, or crossbred to produce
colonies of the particular animal. Examples of such
breeding strategies include but are not limited to:
outbreeding of founder animals with more than one
integration site in order to establish separate lines;
inbreeding of separate lines in order to produce compound
target gene transgenics that express the target gene
transgene of interest at higher levels because of the
effects of additive expression of each target gene
transgene; crossing of heterozygous transgenic animals to
produce animals homozygous for a given integration site in
order to both augment expression and eliminate the possible
need for screening of animals by DNA analysis; crossing of
separate homozygous lines to produce compound heterozygous
or homozygous lines; breeding animals to different inbred
genetic backgrounds so as to examine effects of modifying
alleles on expression of the target gene transgene and the
development of symptoms for tumor progression disorders.
One such approach is to cross the target gene transgenic
founder animals with a wild type strain to produce an F1
generation that exhibits symptoms for tumor progression
disorders. The F1 generation can then be inbred in order to
develop a homozygous line, if it is found that homozygous
target gene transgenic animals are viable.
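
The expected outcome of the crosses described above can be sketched with a simple enumeration. The following Python example is illustrative only; it assumes a single integration site that segregates in Mendelian fashion, and the allele symbols ("T" for the transgene, "+" for the wild-type site) and the function name are hypothetical.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Return expected offspring genotype fractions for one integration site.

    Each parent is a two-character genotype such as "T+" (heterozygous),
    "TT" (homozygous transgenic) or "++" (wild type). Assumption: simple
    Mendelian segregation, with each parental allele equally likely.
    """
    # Pair every allele of one parent with every allele of the other and
    # sort each pair so that "T+" and "+T" count as the same genotype.
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


# Founder (heterozygous) crossed with a wild type strain: F1 generation.
print(cross("T+", "++"))
# Intercross of heterozygous F1 animals to obtain homozygous animals.
print(cross("T+", "T+"))
```

Under these assumptions, one quarter of the offspring of a heterozygous intercross would be expected to be homozygous for the integration site, provided that homozygous animals are viable.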

5.7.2. CELL-BASED ASSAYS

Cells that contain and express target gene sequences
which encode target gene protein, and, further, exhibit
cellular phenotypes associated with tumor progression
disorders, can be utilized to identify compounds that
exhibit an ability to prevent and/or ameliorate tumor
progression. Cellular phenotypes which can indicate an
ability to ameliorate symptoms of tumor progression
disorders can include, for example, tumor cells with low or
high metastatic potential.

Further, the fingerprint pattern of gene expression
of cells of interest can be analyzed and compared to the
normal fingerprint pattern. Those compounds which cause
cells exhibiting cellular phenotypes of tumor progression
disorders, including metastatic diseases, to produce a
fingerprint pattern more closely resembling a normal
fingerprint pattern for the cell of interest can be
considered candidates for further testing regarding an
ability to ameliorate the symptoms of such diseases.
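
One way such a comparison of fingerprint patterns might be carried out computationally is sketched below. This Python example is illustrative only: it assumes each fingerprint is a mapping from gene name to an expression level, and it uses Euclidean distance as the measure of resemblance, which is a choice made here for illustration and is not prescribed by this specification. All gene names, compound names and values are hypothetical.

```python
import math


def fingerprint_distance(a: dict, b: dict) -> float:
    """Euclidean distance between two fingerprints over their shared genes."""
    genes = a.keys() & b.keys()
    if not genes:
        raise ValueError("fingerprints share no genes")
    return math.sqrt(sum((a[g] - b[g]) ** 2 for g in genes))


def rank_candidates(normal: dict, untreated: dict, treated: dict) -> list:
    """Return (compound, distance) pairs for compounds whose treated-cell
    fingerprint is closer to the normal fingerprint than the untreated
    fingerprint is, sorted from closest to farthest."""
    baseline = fingerprint_distance(untreated, normal)
    hits = []
    for compound, profile in treated.items():
        distance = fingerprint_distance(profile, normal)
        if distance < baseline:
            hits.append((compound, distance))
    return sorted(hits, key=lambda item: item[1])


# Hypothetical expression levels for two fingerprint genes.
normal = {"gene_1": 1.0, "gene_2": 1.0}
untreated = {"gene_1": 4.0, "gene_2": 5.0}
treated = {
    "compound_A": {"gene_1": 1.0, "gene_2": 2.0},
    "compound_B": {"gene_1": 5.0, "gene_2": 9.0},
}
print(rank_candidates(normal, untreated, treated))
```

In this sketch only the hypothetical compound_A moves the fingerprint toward the normal pattern, so only it would be carried forward as a candidate for further testing.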

Cells which will be utilized for such assays can,
for example, include non-recombinant cell lines, such as,
but not limited to, melanoma (e.g., B16 F1 and B16 F10 cell
lines), human colon (e.g., KM12c and KM12L4 cell lines),
prostate (e.g., DU 145 and PC-3-M cell lines) and breast
cancer cell lines (e.g., MCF-7 and MDA-MB-435 cell lines).
In addition, purified primary or secondary tumor cells
derived from either transgenic or non-transgenic tumor cells
can be used.

Further, cells which can be used for such assays can
also include recombinant, transgenic cell lines. For
example, the metastatic disease animal models of the
invention, discussed, above, in Section 5.2.4.1, can be used
to generate cell lines, containing one or more cell types
involved in metastatic diseases, that can be used as cell
culture models for these disorders. While primary cultures
derived from the metastasis in transgenic animals of the
invention can be utilized, the generation of continuous cell
lines is preferred. For examples of techniques which can be
used to derive a continuous cell line from the transgenic
animals, see Small et al., 1985, Mol. Cell Biol. 5:642-648.

Alternatively, cells of a cell type known to be
involved in metastatic diseases can be transfected with
sequences capable of increasing or decreasing the amount of
target gene expression within the cell. For example, target
gene sequences can be introduced into, and overexpressed
in, the genome of the cell of interest, or, if endogenous
target gene sequences are present, they can either be
overexpressed or, alternatively, be disrupted in order to
underexpress or inactivate target gene expression.

In order to overexpress a target gene sequence, the
coding portion of the target gene sequence can be ligated to
a regulatory sequence which is capable of driving gene
expression in the cell type of interest. Such regulatory
regions will be well known to those of skill in the art, and
can be utilized in the absence of undue experimentation.

For underexpression of an endogenous target gene
sequence, such a sequence can be isolated and engineered
such that when reintroduced into the genome of the cell type
of interest, the endogenous target gene alleles will be
inactivated. Preferably, the engineered target gene
sequence is introduced via gene targeting such that the
endogenous target sequence is disrupted upon integration of
the engineered target gene sequence into the cell's genome.
Gene targeting is discussed, above, in Section 5.7.1.

Transfection of target gene sequence nucleic acid
can be accomplished by utilizing standard techniques. See,
for example, Ausubel, 1989, supra. Transfected cells should
be evaluated for the presence of the recombinant target gene
sequences, for expression and accumulation of target gene
mRNA, and for the presence of recombinant target gene
protein production. In instances wherein a decrease in
target gene expression is desired, standard techniques can
be used to demonstrate whether a decrease in endogenous
target gene expression and/or in target gene product
production is achieved.

5.8. SCREENING ASSAYS FOR COMPOUNDS THAT
INTERACT WITH THE TARGET GENE PRODUCT

The following assays are designed to identify
compounds that bind to target gene products, compounds that
bind to other cellular proteins that interact with a target
gene product, and compounds that interfere with the
interaction of the target gene product with other cellular
proteins.

Such compounds can include, but are not limited to,
other cellular proteins. Specifically, such compounds can
include, but are not limited to, peptides, such as, for
example, soluble peptides, including, but not limited to,
Ig-tailed fusion peptides, comprising extracellular portions
of target gene product transmembrane receptors, and members
of random peptide libraries (see, e.g., Lam, K.S. et al.,
1991, Nature 354:82-84; Houghten, R. et al., 1991, Nature
354:84-86), made of D- and/or L-configuration amino acids,
phosphopeptides (including, but not limited to, members of
random or partially degenerate phosphopeptide libraries;
see, e.g., Songyang, Z. et al., 1993, Cell 72:767-778),
antibodies (including, but not limited to, polyclonal,
monoclonal, humanized, anti-idiotypic, chimeric or single
chain antibodies, and Fab, F(ab')2 and Fab expression library
fragments, and epitope-binding fragments thereof), and small
organic or inorganic molecules.

Compounds identified via assays such as those
described herein can be useful, for example, in elaborating
the biological function of the target gene product, and for
ameliorating symptoms of tumor progression. In instances,
for example, whereby a tumor progression state or disorder
results from a lower overall level of target gene
expression, target gene product, and/or target gene product
activity in a cell involved in the tumor progression state
or disorder, compounds that interact with the target gene
product can include ones which accentuate or amplify the
activity of the bound target gene protein. Such compounds
would bring about an effective increase in the level of
target gene activity, thus ameliorating symptoms of the
tumor progression disorder or state. In instances whereby
mutations within the target gene cause aberrant target gene
proteins to be made which have a deleterious effect that
leads to tumor progression, compounds that bind target gene
protein can be identified that inhibit the activity of the
bound target gene protein. Assays for testing the
effectiveness of compounds, identified by, for example,
techniques such as those described in Sections 5.8.1-5.8.3,
are discussed below in Section 5.8.4.

5.8.1. IN VITRO SCREENING ASSAYS FOR COMPOUNDS
THAT BIND TO A TARGET GENE PRODUCT

In vitro systems can be designed to identify
compounds capable of binding the target gene products of the
invention. Compounds identified can be useful, for example,
in modulating the activity of wild type and/or mutant target
gene products, preferably mutant target gene proteins, can
be useful in elaborating the biological function of the
target gene product, can be utilized in screens for
identifying compounds that disrupt normal target gene
interactions, or can in themselves disrupt such
interactions.

The principle of the assays used to identify
compounds that bind to the target gene product involves
preparing a reaction mixture of the target gene protein and
the test compound under conditions and for a time sufficient
to allow the two components to interact and bind, thus
forming a complex which can be removed and/or detected in
the reaction mixture. These assays can be conducted in a
variety of ways. For example, one method to conduct such an
assay would involve anchoring target gene product or the
test substance onto a solid phase and detecting target gene
product/test compound complexes anchored on the solid phase
at the end of the reaction. In one embodiment of such a
method, the target gene product can be anchored onto a solid
surface, and the test compound, which is not anchored, can
be labeled, either directly or indirectly.

In practice, microtitre plates can conveniently be
utilized as the solid phase. The anchored component can be
immobilized by non-covalent or covalent attachments.
Non-covalent attachment can be accomplished by simply coating
the solid surface with a solution of the protein and drying.
Alternatively, an immobilized antibody, preferably a
monoclonal antibody, specific for the protein to be
immobilized can be used to anchor the protein to the solid
surface. The surfaces can be prepared in advance and
stored.

In order to conduct the assay, the nonimmobilized
component is added to the coated surface containing the
anchored component. After the reaction is complete,
unreacted components are removed (e.g., by washing) under
conditions such that any complexes formed will remain
immobilized on the solid surface. The detection of
complexes anchored on the solid surface can be accomplished
in a number of ways. Where the previously nonimmobilized
component is pre-labeled, the detection of label immobilized
on the surface indicates that complexes were formed. Where
the previously nonimmobilized component is not pre-labeled,
an indirect label can be used to detect complexes anchored
on the surface; e.g., using a labeled antibody specific for
the previously nonimmobilized component (the antibody, in
turn, can be directly labeled or indirectly labeled with a
labeled anti-Ig antibody).

Alternatively, a reaction can be conducted in a
liquid phase, the reaction products separated from unreacted
components, and complexes detected; e.g., using an
immobilized antibody specific for target gene product or the
test compound to anchor any complexes formed in solution,
and a labeled antibody specific for the other component of
the possible complex to detect anchored complexes.

5.8.2. ASSAYS FOR CELLULAR PROTEINS THAT
INTERACT WITH THE TARGET GENE PRODUCT

Any method suitable for detecting protein-protein
interactions can be employed for identifying novel target
product-cellular or extracellular protein interactions.
These methods are outlined in Section 5.1.3., supra, for the
identification of pathway genes, and can be utilized herein
with respect to the identification of proteins which
interact with identified target proteins. In such a case,
the target gene serves as the known "bait" gene.

5.8.3. ASSAYS FOR COMPOUNDS THAT
INTERFERE WITH TARGET
GENE/CELLULAR PRODUCT INTERACTION

The target gene products of the invention can, in
vivo, interact with one or more cellular or extracellular
macromolecules, such as proteins. Such macromolecules
include, but are not limited to, nucleic acid molecules and
those products identified via methods such as those
described above in Section 5.8.2. For the purposes of
this discussion, such cellular and extracellular
macromolecules are referred to herein as "binding partners."
Compounds that disrupt such interactions can be useful in
regulating the activity of the target gene product,
especially mutant target gene products. Such compounds can
include, but are not limited to, molecules such as
antibodies, peptides, and the like described in Section
5.3.1. above.

The basic principle of the assay systems used to
identify compounds that interfere with the interaction
between the target gene product and its cellular or
extracellular binding partner or partners involves preparing
a reaction mixture containing the target gene product and
the binding partner under conditions and for a time
sufficient to allow the two products to interact and bind,
thus forming a complex. In order to test a compound for
inhibitory activity, the reaction mixture is prepared in the
presence and absence of the test compound. The test
compound can be initially included in the reaction mixture,
or can be added at a time subsequent to the addition of
target gene product and its cellular or extracellular binding
partner. Control reaction mixtures are incubated without
the test compound or with a placebo. The formation of any
complexes between the target gene product and the cellular
or extracellular binding partner is then detected. The
formation of a complex in the control reaction, but not in
the reaction mixture containing the test compound, indicates
that the compound interferes with the interaction of the
target gene product and the interactive binding partner.
Additionally, complex formation within reaction mixtures
containing the test compound and normal target gene product
can also be compared to complex formation within reaction
mixtures containing the test compound and mutant target gene
product. This comparison can be important in those cases
wherein it is desirable to identify compounds that disrupt
interactions of mutant but not normal target gene products.

The assay for compounds that interfere with the
interaction of the target gene products and binding partners
can be conducted in a heterogeneous or homogeneous format.
Heterogeneous assays involve anchoring either the target
gene product or the binding partner onto a solid phase and
detecting complexes anchored on the solid phase at the end
of the reaction. In homogeneous assays, the entire reaction
is carried out in a liquid phase. In either approach, the
order of addition of reactants can be varied to obtain
different information about the compounds being tested. For
example, test compounds that interfere with the interaction
between the target gene products and the binding partners,
e.g., by competition, can be identified by conducting the
reaction in the presence of the test substance; i.e., by
adding the test substance to the reaction mixture prior to
or simultaneously with the target gene product and
interactive cellular or extracellular binding partner.
Alternatively, test compounds that disrupt preformed
complexes, e.g., compounds with higher binding constants that
displace one of the components from the complex, can be
tested by adding the test compound to the reaction mixture
after complexes have been formed. The various formats are
described briefly below.

In a heterogeneous assay system, either the target
gene product or the interactive cellular or extracellular
binding partner is anchored onto a solid surface, while the
non-anchored species is labeled, either directly or
indirectly. In practice, microtitre plates are conveniently
utilized. The anchored species can be immobilized by
non-covalent or covalent attachments. Non-covalent attachment
can be accomplished simply by coating the solid surface with
a solution of the target gene product or binding partner and
drying. Alternatively, an immobilized antibody specific for
the species to be anchored can be used to anchor the species
to the solid surface. The surfaces can be prepared in
advance and stored.

In order to conduct the assay, the partner of the
immobilized species is exposed to the coated surface with or
without the test compound. After the reaction is complete,
unreacted components are removed (e.g., by washing) and any
complexes formed will remain immobilized on the solid
surface. The detection of complexes anchored on the solid
surface can be accomplished in a number of ways. Where the
non-immobilized species is pre-labeled, the detection of
label immobilized on the surface indicates that complexes
were formed. Where the non-immobilized species is not
pre-labeled, an indirect label can be used to detect complexes
anchored on the surface; e.g., using a labeled antibody
specific for the initially non-immobilized species (the
antibody, in turn, can be directly labeled or indirectly
labeled with a labeled anti-Ig antibody). Depending upon
the order of addition of reaction components, test compounds
which inhibit complex formation or which disrupt preformed
complexes can be detected.



Alternatively, the reaction can be conducted in a
liquid phase in the presence or absence of the test
compound, the reaction products separated from unreacted
components, and complexes detected; e.g., using an
immobilized antibody specific for one of the binding
components to anchor any complexes formed in solution, and a
labeled antibody specific for the other partner to detect
anchored complexes. Again, depending upon the order of
addition of reactants to the liquid phase, test compounds
which inhibit complex formation or which disrupt preformed
complexes can be identified.

In an alternate embodiment of the invention, a
homogeneous assay can be used. In this approach, a
preformed complex of the target gene product and the
interactive cellular or extracellular binding partner
product is prepared in which either the target gene products
or their binding partners are labeled, but the signal
generated by the label is quenched due to complex formation
(see, e.g., U.S. Patent No. 4,109,496 by Rubenstein which
utilizes this approach for immunoassays). The addition of a
test substance that competes with and displaces one of the
species from the preformed complex will result in the
generation of a signal above background. In this way, test
substances which disrupt target gene product-cellular or
extracellular binding partner interaction can be identified.

In a particular embodiment, the target gene product
can be prepared for immobilization using recombinant DNA
techniques described in Section 5.1.2, supra. For example,
the target gene coding region can be fused to a
glutathione-S-transferase (GST) gene using a fusion vector
such as pGEX-5X-1, in such a manner that its binding activity
is maintained in the resulting fusion product. The interactive
cellular or extracellular product can be purified and used
to raise a monoclonal antibody, using methods routinely
practiced in the art and described above in Section 5.2.4.
This antibody can be labeled with the radioactive isotope
125I, for example, by methods routinely practiced in the
art. In a heterogeneous assay, e.g., the GST-target gene
fusion product can be anchored to glutathione-agarose beads.
The interactive cellular or extracellular binding partner
product can then be added in the presence or absence of the
test compound in a manner that allows interaction and
binding to occur. At the end of the reaction period,
unbound material can be washed away, and the labeled
monoclonal antibody can be added to the system and allowed
to bind to the complexed components. The interaction
between the target gene product and the interactive cellular
or extracellular binding partner can be detected by
measuring the amount of radioactivity that remains
associated with the glutathione-agarose beads. A successful
inhibition of the interaction by the test compound will
result in a decrease in measured radioactivity.

Alternatively, the GST-target gene fusion product 
and the interactive cellular or extracellular binding 
partner product can be mixed together in liquid in the 
absence of the solid glutathione-agarose beads. The test 
compound can be added either during or after the binding 
partners are allowed to interact. This mixture can then be 
added to the glutathione-agarose beads and unbound material 
is washed away. Again, the extent of inhibition of the
binding partner interaction can be detected by adding the
labeled antibody and measuring the radioactivity associated
with the beads.
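
In either GST-based format described above, the readout is the radioactivity associated with the beads in the presence versus the absence of the test compound. The following Python sketch shows one way such counts could be turned into a percent inhibition value. The function name, the optional background subtraction, and the example counts are illustrative assumptions and are not taken from the specification.

```python
def percent_inhibition(control_cpm, test_cpm, background_cpm=0.0):
    """Percent loss of bead-associated signal relative to the no-compound control.

    control_cpm    -- counts measured without the test compound
    test_cpm       -- counts measured with the test compound
    background_cpm -- counts from a blank (assumed, e.g. beads without GST fusion)
    """
    control_signal = control_cpm - background_cpm
    if control_signal <= 0:
        raise ValueError("control signal must exceed background")
    # Clamp at zero so noise below background does not exceed 100 % inhibition.
    test_signal = max(test_cpm - background_cpm, 0.0)
    return 100.0 * (1.0 - test_signal / control_signal)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(round(percent_inhibition(12000, 3200, background_cpm=200), 1))
```

A value near zero would indicate no interference with complex formation, while a value approaching 100 would correspond to the decrease in measured radioactivity described above.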

In another embodiment of the invention, these same
techniques can be employed using peptide fragments that
correspond to the binding domains of the target gene product
and the interactive cellular or extracellular binding
partner (in cases where the binding partner is a product), in
place of one or both of the full length products. Any
number of methods routinely practiced in the art can be used
to identify and isolate the protein's binding site. These
methods include, but are not limited to, mutagenesis of one
of the genes encoding one of the products and screening for
disruption of binding in a co-immunoprecipitation assay.
Compensating mutations in the gene encoding the second
species in the complex can be selected. Sequence analysis
of the genes encoding the respective products will reveal
the mutations that correspond to the region of the product
involved in interactive binding. Alternatively, one product
can be anchored to a solid surface using methods described
in this Section above, and allowed to interact with and bind
to its labeled binding partner, which has been treated with
a proteolytic enzyme, such as trypsin. After washing, a
short, labeled peptide comprising the binding domain can
remain associated with the solid material, which can be
isolated and identified by amino acid sequencing. Also,
once the gene coding for the cellular or extracellular
binding partner product is obtained, short gene segments can
be engineered to express peptide fragments of the product,
which can then be tested for binding activity and purified
or synthesized.

5.8.4. ASSAYS FOR AMELIORATION OF
TUMOR PROGRESSION SYMPTOMS

Any of the binding compounds, including but not
limited to, compounds such as those identified in the
foregoing assay systems, can be tested for the ability to
prevent and/or ameliorate symptoms of tumor progression and
tumor progression disorders, including metastatic disease.
Cell-based and animal model-based assays for the
identification of compounds exhibiting an ability to prevent
and/or ameliorate tumor progression symptoms are described
below.

First, cell-based systems such as those described
above in Section 5.7.2 can be used to identify compounds
which can act to ameliorate symptoms of tumor progression.
For example, such cell systems can be exposed to a compound
suspected of exhibiting an ability to ameliorate tumor
progression symptoms, at a sufficient concentration and for
a time sufficient to elicit such an amelioration in the
exposed cells. After exposure, the cells are examined to
determine whether one or more tumor progression state or
tumor progression disorder phenotypes has been altered to
resemble a more normal or more wild-type, non-neoplastic
disease phenotype.

Taking, as an example, tumor progression involving 
metastasis, cell-based systems such as the highly metastatic 
B16 F10 melanoma cell line can be utilized. Upon exposure 
to such cell systems, compounds can be assayed for their 
ability to reduce the metastatic potential of such cells. 
Further, the level of 030 gene expression within these cells 
may be assayed. Presumably, an increase in the observed
level of 030 gene expression would indicate an amelioration 
of the metastatic tumor progression state. 

In addition, animal-based systems, such as those
described above in Section 5.7.1, can be used to identify
compounds capable of ameliorating symptoms of tumor
progression. Such animal models can be used as test
substrates for the identification of drugs, pharmaceuticals,
therapies, and interventions which can be effective in
treating tumor progression disorders. For example, animal
models can be exposed to a compound suspected to exhibit an
ability to ameliorate tumor progression symptoms, at a
sufficient concentration and for a time sufficient to elicit
such an amelioration in the exposed animals. The response
of the animals to the exposure can be monitored by assessing
the reversal of disorders associated with tumor progression.
With regard to intervention, any treatments which reverse
any aspect of symptoms of tumor progression, such as, for
example, those associated with metastatic disease, should be
considered as candidates for human therapeutic intervention
in the treatment of tumor progression. Dosages of test
agents can be determined by deriving dose-response curves,
as discussed in Section 5.10, below.

Further, gene expression patterns can be utilized to 
assess the ability of a compound to ameliorate symptoms of 
tumor progression and tumor progression disorders. For 
example, fingerprint gene expression or a fingerprint
pattern can be used in such an assessment. Fingerprint
gene expression and fingerprint patterns are described, 
below, in Section 5.11. 

Fingerprint patterns can be characterized for known
states (e.g., normal or known pre-neoplastic, neoplastic or
metastatic states) within the cell- and/or animal-based
model systems. Subsequently, these known fingerprint
patterns can be compared to ascertain the effect a test
compound has in modifying such fingerprint patterns, and in
causing the pattern to more closely resemble a more
desirable fingerprint pattern.

For example, administration of a compound can cause 
the fingerprint pattern of a metastatic disease model system 
to more closely resemble a control, normal system. 
Administration of a compound can, alternatively, cause the 
fingerprint pattern of a control system to begin to mimic 
tumor progression states, such as metastatic disease states. 
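
The comparison of fingerprint patterns described above can be sketched numerically. The specification does not prescribe a similarity measure, so the Euclidean distance used here is an assumption chosen only for illustration, as are the function names, the expression values, and the hypothetical second gene "geneB".

```python
import math


def pattern_distance(pattern_a, pattern_b):
    """Euclidean distance between two fingerprint patterns over their shared genes.

    Each pattern maps a gene name to a relative expression level.
    The distance metric is an illustrative assumption.
    """
    shared = sorted(set(pattern_a) & set(pattern_b))
    if not shared:
        raise ValueError("patterns share no genes")
    return math.sqrt(sum((pattern_a[g] - pattern_b[g]) ** 2 for g in shared))


def shift_toward_reference(reference, untreated, treated):
    """Positive if the treated pattern lies closer to the reference pattern
    than the untreated pattern does; negative if it has moved away."""
    return pattern_distance(untreated, reference) - pattern_distance(treated, reference)


if __name__ == "__main__":
    # Hypothetical relative expression levels, for illustration only.
    normal = {"030": 1.0, "geneB": 1.0}
    metastatic = {"030": 0.2, "geneB": 2.0}
    treated = {"030": 0.8, "geneB": 1.2}
    print(shift_toward_reference(normal, metastatic, treated) > 0)
```

Under these assumptions, a positive shift would correspond to a compound that causes the disease-model pattern to more closely resemble the control pattern, and a negative shift to a compound that moves a pattern toward a tumor progression state.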



5.8.5. MONITORING OF EFFECTS
DURING CLINICAL TRIALS

Monitoring the influence of compounds on tumor 
progression can be applied not only in basic drug screening, 
but also in clinical trials. In such clinical trials, the 
expression of a panel of genes that have been discovered in 
any one of the paradigms described in Section 5.1.1.1 can
be used as a "read out" of the tumor progression state of a 
particular cell. 

For example, and not by way of limitation, the 
paradigm describing the B16 melanoma cells provides for the 
identification of fingerprint genes (e.g., 030) that are
down-regulated in metastatic tumor cells. For example, in a 
clinical trial, tumor cells can be isolated from the primary 
tumors removed by surgery, and RNA prepared and analyzed by 
differential display as described in Section 6.1. The 
levels of expression of the fingerprint genes can be 
quantified by Northern blot analysis or RT-PCR, as described 
in Section 6.1, or alternatively by measuring the amount of 
protein produced, by one of the methods described in Section 
5.7.2. In this way, the fingerprint profiles can serve as 
putative biomarkers indicative of the metastatic potential 
of the tumor cell. Thus, by monitoring the level of 
expression of romy030, a protocol for suitable 
chemotherapeutic anticancer drugs can be developed based on 
the metastatic potential of tumor cells in the primary tumor. In
cases of inoperable metastatic disease, patients can have
biopsies removed for measurement of romy030 expression so 
that the drug's efficacy can be measured by monitoring the 
degree of restored expression of romy030. 



5.9. COMPOUNDS AND METHODS FOR
TREATMENT OF TUMOR PROGRESSION

Described herein are methods and compositions which
can be used to ameliorate symptoms of tumor progression and
disorders involving tumor progression via, first, target
gene modulation, and/or second, a depletion of the cells
involved in tumor progression. Target gene modulation can
be of a positive or negative nature, depending on the
specific situation involved, but each modulatory event
yields a net result in which tumor progression symptoms are
ameliorated.

"Negative modulation," as used herein, refers to a 
reduction in the level and/or activity of target gene
product relative to the level and/or activity of the target
gene product in the absence of the modulatory treatment.

"Positive modulation," as used herein, refers to an
increase in the level and/or activity of target gene product
relative to the level and/or activity of target gene product
in the absence of modulatory treatment.

It is possible that tumor progression can be brought
about, at least in part, by an abnormal level of gene
product, or by the presence of a gene product exhibiting
abnormal activity. As such, the reduction in the level
and/or activity of such gene products would bring about the
amelioration of tumor progression symptoms. Negative
modulatory techniques for the reduction of target gene
expression levels or target gene product activity levels are
discussed in Section 5.9.1, below.

Alternatively, it is possible that tumor progression
can be brought about, at least in part, by the absence or
reduction of the level of gene expression, or a reduction in
the level of a gene product's activity. As such, an
increase in the level of gene expression and/or the activity
of such gene products would bring about the amelioration of
tumor progression symptoms.

For example, as demonstrated in the Example
presented in Section 6, below, a reduction in the level of
030 gene expression correlates with a highly metastatic
tumor progression state. A 030 positive modulatory
technique which increased 030 gene expression in cells
within a highly metastatic tumor progression state should,
therefore, act to ameliorate the symptoms of such a state.
Further, because the 030 gene product may exhibit general
tumor suppressor features, it is possible that a 030
positive modulatory technique could ameliorate symptoms of
many tumor progression events.

Positive modulatory techniques for increasing the
target gene expression levels or target gene product
activity levels are discussed in Section 5.9.2, below.

Additionally, tumor progression treatment techniques
whereby the concentration of cells involved in tumor
progression is depleted are described, below, in Section
5.9.3.

Among the tumor progression events which may be
treated are those associated with human tumors. Such human
tumors may include, for example, human melanomas, breast,
gastrointestinal, such as esophageal, stomach, colon, bowel,
colorectal and rectal cancers, prostate, bladder,
testicular, ovarian, uterine, cervical, brain, lung,
bronchial, larynx, pharynx, liver, pancreatic, thyroid,
bone, leukemias, lymphomas and various types of skin
cancers.

5.9.1. NEGATIVE MODULATORY TECHNIQUES

As discussed above, successful treatment of tumor
progression symptoms and of disorders involving tumor
progression can be brought about by techniques which serve
to inhibit the expression or activity of target gene
products.

For example, compounds such as those identified
through assays described above in Section 5.8, which
exhibit negative modulatory activity, can be used in
accordance with the invention to prevent and/or ameliorate
symptoms of tumor progression, including tumor progression
involving metastatic disorders. As discussed in Section
5.8., above, such molecules can include, but are not limited
to, peptides, phosphopeptides, small organic or inorganic
molecules, or antibodies (including, for example,
polyclonal, monoclonal, humanized, anti-idiotypic, chimeric
or single chain antibodies, and Fab, F(ab')2 and Fab
expression library fragments, and epitope-binding fragments
thereof). Negative modulatory techniques involving antibody
administration are described, below, in Section 5.9.1.2.
Techniques for the determination and administration of such
compounds are described, below, in Section 5.10.

Further, antisense and ribozyme molecules which
inhibit expression of the target gene can also be used in
accordance with the invention to reduce the level of target
gene expression, thus effectively reducing the level of
target gene activity. Still further, triple helix molecules
can be utilized in reducing the level of target gene
activity. Such techniques are described, below, in Section
5.9.1.1.

5.9.1.1. NEGATIVE MODULATORY ANTISENSE,
RIBOZYME AND TRIPLE HELIX APPROACHES

Among the compounds which can exhibit the ability to
prevent and/or ameliorate symptoms of tumor progression are
antisense, ribozyme, and triple helix molecules. Such
molecules can be designed to reduce or inhibit either wild
type, or if appropriate, mutant target gene activity.
Techniques for the production and use of such molecules are
well known to those of skill in the art.

Antisense RNA and DNA molecules act to directly
block the translation of mRNA by hybridizing to targeted
mRNA and preventing protein translation. With respect to
antisense DNA, oligodeoxyribonucleotides derived from the
translation initiation site, e.g., between the -10 and +10
regions of the target gene nucleotide sequence of interest,
are preferred.
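
As an illustration of the preferred antisense region, the following Python sketch derives an antisense oligodeoxyribonucleotide spanning positions -10 to +10 around a translation initiation codon. The numbering convention (the A of the start codon taken as +1, with no position 0), the function name, and the example sequence are assumptions made for illustration and are not taken from the specification.

```python
# Watson-Crick complements, written as DNA so the result is an
# oligodeoxyribonucleotide; both U (mRNA) and T (cDNA) input are accepted.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna, start_index, upstream=10, downstream=10):
    """Return the antisense DNA oligo (5'->3') covering -upstream..+downstream.

    start_index is the 0-based index of the A of the initiation codon, which
    is treated as position +1 (an assumed numbering convention).
    """
    seq = mrna.upper()
    if seq[start_index:start_index + 3] not in ("AUG", "ATG"):
        raise ValueError("start_index does not point at an initiation codon")
    if start_index < upstream or start_index + downstream > len(seq):
        raise ValueError("sequence too short for the requested window")
    window = seq[start_index - upstream:start_index + downstream]
    # Reverse complement: the oligo hybridizes antiparallel to the mRNA.
    return "".join(COMPLEMENT[base] for base in reversed(window))


if __name__ == "__main__":
    # Hypothetical mRNA fragment, for illustration only.
    fragment = "GGCACGAGCCACCAUGGCUAGCAAGG"
    print(antisense_oligo(fragment, fragment.find("AUG")))
```

The window bounds are parameters so that other regions around the initiation site could be examined in the same way.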

Ribozymes are enzymatic RNA molecules capable of
catalyzing the specific cleavage of RNA. (For a review,
see, for example, Rossi, J., 1994, Current Biology
4:469-471.) The mechanism of ribozyme action involves sequence
specific hybridization of the ribozyme molecule to
complementary target RNA, followed by an endonucleolytic
cleavage. The composition of ribozyme molecules must
include one or more sequences complementary to the target
gene mRNA and must include the well-known catalytic sequence
responsible for mRNA cleavage. For this sequence, see U.S.
Pat. No. 5,093,246, which is incorporated by reference
herein in its entirety. As such, within the scope of the
invention are engineered hammerhead motif ribozyme molecules
that specifically and efficiently catalyze endonucleolytic
cleavage of RNA sequences encoding target gene proteins.

Specific ribozyme cleavage sites within any
potential RNA target are initially identified by scanning
the molecule of interest for ribozyme cleavage sites which
include the following sequences: GUA, GUU, and GUC. Once
identified, short RNA sequences of between 15 and 20
ribonucleotides corresponding to the region of the target
gene containing the cleavage site can be evaluated for
predicted structural features, such as secondary structure,
that can render the oligonucleotide sequence unsuitable.
The suitability of candidate sequences can also be evaluated
by testing their accessibility to hybridization with
complementary oligonucleotides, using ribonuclease
protection assays.
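
The initial scanning step described above can be sketched in Python as follows. The sketch locates GUA, GUU and GUC triplets and extracts a flanking window of 15 to 20 ribonucleotides for later structural evaluation. Placing the triplet near the center of the window, the function names, and the example sequence are assumptions for illustration; the secondary structure evaluation itself is not attempted here.

```python
import re

# The three cleavage-site triplets named in the text: GUA, GUU and GUC.
CLEAVAGE_SITE = re.compile(r"GU[AUC]")


def find_cleavage_sites(rna):
    """Return (0-based position, triplet) for each GUA/GUU/GUC in the sequence.

    DNA-style input is accepted and converted by replacing T with U.
    """
    seq = rna.upper().replace("T", "U")
    return [(m.start(), m.group()) for m in CLEAVAGE_SITE.finditer(seq)]


def candidate_windows(rna, length=17):
    """Return (position, triplet, window) with the site roughly centered.

    Sites too close to either end to give a full-length window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("window length should be between 15 and 20")
    seq = rna.upper().replace("T", "U")
    windows = []
    for pos, triplet in find_cleavage_sites(seq):
        left = pos - (length - 3) // 2
        right = left + length
        if left < 0 or right > len(seq):
            continue
        windows.append((pos, triplet, seq[left:right]))
    return windows


if __name__ == "__main__":
    # Hypothetical RNA fragment, for illustration only.
    for hit in candidate_windows("AAACCCGGGAAAGUCAAACCCGGGAAAGUGAAA"):
        print(hit)
```

Each returned window would then be a candidate for the structural and accessibility evaluation described above.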

Nucleic acid molecules to be used in triple helix
formation for the inhibition of transcription should be
single stranded and composed of deoxynucleotides. The base
composition of these oligonucleotides must be designed to
promote triple helix formation via Hoogsteen base pairing
rules, which generally require sizeable stretches of either
purines or pyrimidines to be present on one strand of a
duplex. Nucleotide sequences can be pyrimidine-based, which
will result in TAT and CGC+ triplets across the three
associated strands of the resulting triple helix. The
pyrimidine-rich molecules provide base complementarity to a
purine-rich region of a single strand of the duplex in a
parallel orientation to that strand. In addition, nucleic
acid molecules can be chosen that are purine-rich, for
example, contain a stretch of G residues. These molecules
will form a triple helix with a DNA duplex that is rich in
GC pairs, in which the majority of the purine residues are
located on a single strand of the targeted duplex, resulting
in GGC triplets across the three strands in the triplex.
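
As a rough illustration of the pyrimidine-based case, the sketch below locates purine stretches on one strand of a duplex and writes out a pyrimidine third strand that would give TAT and CGC+ triplets in a parallel orientation. The function names and the `min_len` threshold of 12 are hypothetical; the specification does not quantify what counts as a "sizeable" stretch.

```python
import re


def find_purine_tracts(strand, min_len=12):
    """Return (start, tract) for each run of A/G of at least min_len bases.

    min_len is an illustrative assumption, not a value from the text.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.group(0)) for m in pattern.finditer(strand.upper())]


def pyrimidine_third_strand(purine_tract):
    """Map a purine tract to a parallel pyrimidine third strand.

    A in the duplex purine strand is matched by T (TAT triplet) and
    G is matched by C, which must be protonated (CGC+ triplet).
    The result is read in the same 5'-3' direction as the purine tract.
    """
    tract = purine_tract.upper()
    if set(tract) - {"A", "G"}:
        raise ValueError("tract must contain only purines (A, G)")
    return tract.translate(str.maketrans("AG", "TC"))


if __name__ == "__main__":
    for start, tract in find_purine_tracts("CTCTAGGAGAAGGAGGAACTCT", 10):
        print(start, tract, pyrimidine_third_strand(tract))
```

The purine-rich and switchback designs are not covered by this sketch.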

Alternatively, the potential sequences that can be
targeted for triple helix formation can be increased by
creating a so called "switchback" nucleic acid molecule.
Switchback molecules are synthesized in an alternating 5'-3',
3'-5' manner, such that they base pair with first one
strand of a duplex and then the other, eliminating the
necessity for a sizeable stretch of either purines or
pyrimidines to be present on one strand of a duplex.

In instances wherein the antisense, ribozyme, and/or
triple helix molecules described herein are utilized to 
reduce or inhibit mutant gene expression, it is possible 
that the technique utilized can also efficiently reduce or 
inhibit the transcription (triple helix) and/or translation 
(antisense, ribozyme) of mRNA produced by normal target gene 
alleles such that the possibility can arise wherein the 
concentration of normal target gene product present can be 
lower than is necessary for a normal phenotype. In such 
cases, to ensure that substantially normal levels of target 
gene activity are maintained, nucleic acid molecules that 
encode and express target gene polypeptides exhibiting 
normal target gene activity can be introduced into cells via 
gene therapy methods such as those described, below, in 
Section 5.9.2 that do not contain sequences susceptible to 
whatever antisense, ribozyme, or triple helix treatments are 
being utilized. Alternatively, in instances whereby the 
target gene encodes an extracellular protein, it can be 
preferable to coadminister normal target gene protein into 
the cell or tissue in order to maintain the requisite level 
of cellular or tissue target gene activity. 

Anti-sense RNA and DNA, ribozyme and triple helix 
molecules of the invention can be prepared by any method 
known in the art for the synthesis of DNA and RNA molecules. 
These include techniques for chemically synthesizing 
oligodeoxyribonucleotides and oligoribonucleotides well 
known in the art such as, for example, solid phase 
phosphoramidite chemical synthesis. Alternatively, RNA
molecules can be generated by in vitro and in vivo
transcription of DNA sequences encoding the antisense RNA 
molecule. Such DNA sequences can be incorporated into a 
wide variety of vectors which incorporate suitable RNA 
polymerase promoters such as the T7 or SP6 polymerase 
promoters. Alternatively, antisense cDNA constructs that 
synthesize antisense RNA constitutively or inducibly, 
depending on the promoter used, can be introduced stably
into cell lines.

Various well-known modifications to the DNA 
molecules can be introduced as a means of increasing 
intracellular stability and half-life. Possible
modifications include but are not limited to the addition of
flanking sequences of ribo- or deoxy-nucleotides to the 5'
and/or 3' ends of the molecule or the use of
phosphorothioate or 2' O-methyl rather than phosphodiester
linkages within the oligodeoxyribonucleotide
backbone.

5.9.1.2. NEGATIVE MODULATORY ANTIBODY TECHNIQUES 

Antibodies can be generated which are both specific 
for target gene product and which reduce target gene product 
activity. Such antibodies may, therefore, be administered
in instances whereby negative modulatory techniques are 
appropriate for the treatment of tumor progression. 
Antibodies can be generated using standard techniques 
described in Section 5.6, above, against the proteins 
themselves or against peptides corresponding to portions of 
the proteins. The antibodies include but are not limited to 
polyclonal, monoclonal, Fab fragments, single chain 
antibodies, chimeric antibodies, and the like. 

In instances where the target gene protein to which 
the antibody is directed is intracellular and whole
antibodies are used, internalizing antibodies can be
preferred. However, lipofectin or liposomes can be used to 
deliver the antibody or a fragment of the Fab region which 
binds to the target gene epitope into cells. Where 
fragments of the antibody are used, the smallest inhibitory 
fragment which binds to the target protein's binding domain 
is preferred. For example, peptides having an amino acid 
sequence corresponding to the domain of the variable region 
of the antibody that binds to the target gene protein can be 
used. Such peptides can be synthesized chemically or 
produced via recombinant DNA technology using methods well 
known in the art (e.g., see Creighton, 1983, supra; and
Sambrook et al., 1989, supra). Alternatively, single chain
neutralizing antibodies which bind to intracellular target 
gene product epitopes can also be administered. Such 
single chain antibodies can be administered, for example, by 
expressing nucleotide sequences encoding single-chain 
antibodies within the target cell population by utilizing, 
for example, techniques such as those described in Marasco 
et al. (Marasco, W. et al., 1993, Proc. Natl. Acad. Sci. USA
90:7889-7893).

In instances where the target gene protein is 
extracellular, or is a transmembrane protein, any of the 
administration techniques described, below, in Section 5.10
which are appropriate for peptide administration can be 
utilized to effectively administer inhibitory target gene 
antibodies to their site of action. 

5.9.2. POSITIVE MODULATORY TECHNIQUES
As discussed above, successful treatment of tumor 
progression symptoms and of disorders involving tumor 
progression can be brought about by techniques which serve
to increase the level of target gene expression or to
increase the activity of a target gene product. 

For example, compounds such as those identified
through assays described, above, in Section 5.8, which
exhibit positive modulatory activity can be used in
accordance with the invention to ameliorate tumor
progression symptoms. As discussed in Section 5.8, above,
such molecules can include, but are not limited to,
peptides, phosphopeptides, small organic or inorganic
molecules, or antibodies (including, for example,
polyclonal, monoclonal, humanized, anti-idiotypic, chimeric
or single chain antibodies, and Fab, F(ab')2 and Fab
expression library fragments, and epitope-binding fragments
thereof). Positive modulatory techniques involving antibody
administration are described, below, in Section 5.9.2.1.

For example, a target gene protein, at a level 
sufficient to ameliorate tumor progression symptoms can be 
administered to a patient exhibiting such symptoms. Any of 
the techniques discussed, below, in Section 5.10, can be
utilized for such administration. One of skill in the art
will readily know how to determine the concentration of 
effective, non-toxic doses of the normal target gene 
protein, utilizing techniques such as those described, 
below, in Section 5.10.1. 

In instances wherein the compound to be administered
is a peptide compound, DNA sequences encoding the peptide
compound can, alternatively, be directly administered to a
patient exhibiting tumor progression symptoms, at a
concentration sufficient to generate the production of an
amount of target gene product adequate to ameliorate tumor
progression symptoms. Any of the techniques described,
below, in Section 5.10, which achieve intracellular
administration, can be utilized for the administration of
such DNA molecules. The DNA molecules can be produced, for
example, by well-known recombinant techniques.

In the case of peptide compounds which act 
extracellularly, the DNA molecules encoding such peptides 
can be taken up and expressed by any cell type, so long as a 
sufficient circulating concentration of peptide results for 
the elicitation of a reduction in tumor progression
symptoms.

In the case of compounds which act intracellularly,
the DNA molecules encoding such peptides must be taken up 
and expressed by cells involved in the tumor progression at 
a sufficient level to bring about the reduction of tumor 
progression symptoms. 

Any technique which serves to selectively administer 
DNA molecules to a cell involved in tumor progression is, 
therefore, preferred for the DNA molecules encoding 
intracellularly acting peptides.

Further, patients can be treated for symptoms of 
tumor progression by gene replacement therapy. One or more 
copies of a normal target gene or a portion of the gene that 
directs the production of a normal target gene protein with 
target gene function can be inserted into cells, using 
vectors which include, but are not limited to adenovirus, 
adeno-associated virus, and retrovirus vectors, in addition 
to other particles that introduce DNA into cells, such as 
liposomes. Techniques such as those described above can be 
utilized for the introduction of normal target gene 
sequences into human cells. 

In instances wherein the target gene encodes an 
extracellular, secreted gene product, such gene replacement 
techniques may be accomplished either in vivo or in vitro. 
For such cases, the cell type expressing the target gene is
less important than achieving a sufficient circulating
concentration of the extracellular molecules for the
amelioration of tumor progression symptoms to occur. In 
vitro, target gene sequences can be introduced into 
autologous cells. Those cells expressing the target gene 
sequence of interest can then be reintroduced, preferably by 
intravenous administration, into the patient such that there 
results an amelioration of tumor progression symptoms. 

In instances wherein the gene replacement involves a 
gene which encodes a product which acts intracellularly, it 
is preferred that gene replacement be accomplished in vivo. 
Further, because the cell type in which the gene replacement 
must occur is the cell type involved in tumor progression, 
such techniques must successfully target such tumor 
progression cells. 

Taking the 030 gene as an example, an increase in
030 expression can serve to ameliorate tumor progression
symptoms, such as, for example, tumor progression symptoms
involving metastatic processes. Therefore, any positive
modulatory technique described herein which increases the 030 gene
product or gene product activity to a level which is 
sufficient to ameliorate tumor progression symptoms 
represents a successful tumor progression therapeutic 
treatment. 

5.9.3. METHODS FOR DEPLETING CELLS
INVOLVED IN TUMOR PROGRESSION

Techniques described herein can be utilized to
deplete the total number of cells involved in tumor
progression, thus effectively decreasing the ratio of the
tumor cells to non-cancerous cells. Specifically,
separation techniques are described which can be used to
deplete the total number of tumor cells present within a
cell population, and, further, targeting techniques are
described which can be utilized to deplete specific tumor
cell subpopulations.

Depending on the particular application, changing 
the number of cells belonging to a tumor cell population can
yield inhibitory responses leading to the amelioration of 
cancerous disorders. 

The separation techniques described herein are based 
on the presence or absence of specific cell surface, 
preferably transmembrane, markers. By way of example, and
not by way of limitation, the techniques described herein 
utilize tumor specific cell surface markers or antigens and 
will describe procedures whereby tumor cells can be 
separated from other cells, thus allowing for selective 
depletion of tumor cells. 

Separation techniques can be utilized which separate 
and purify cells, tumor cells, for example, in vitro from a 
population of cells, such as hematopoietic cells autologous 
to the patient being treated. For example, an initial tumor
cell subpopulation-containing population of cells, such as
hematopoietic cells, can be obtained from a leukemia patient
using standard procedures well known to those of skill in
the art. Peripheral blood can be utilized as one potential
starting source for such techniques, and can, for example,
be obtained via venipuncture and collection into heparinized
tubes.

Once the starting source of autologous cells is 
obtained, tumor cells can be removed, and thus selectively 
separated and purified, by various methods which utilize 
antibodies which bind specific markers present on tumor 
cells while absent on other cells within the starting 
source. These techniques can include, for example, flow
cytometry using a fluorescence activated cell sorter (FACS)
and specific fluorochromes, biotin-avidin or
biotin-streptavidin separations using biotin conjugated to
cell surface marker-specific antibodies and avidin or
streptavidin bound to a solid support such as affinity
column matrix or plastic surfaces, or magnetic separations
using antibody-coated magnetic beads.

Separation via antibodies for specific markers can 
be by negative or positive selection procedures. In
negative separation, antibodies are used which are specific 
for markers present on undesired cells, in this case tumor 
cells, which exhibit, for example, the tumor specific cell 
surface marker. Cells bound by an antibody to such a cell 
surface marker can be removed or lysed and the remaining 
desired mixture retained. In positive separation, 
antibodies specific for markers present on the desired cells 
of interest, in this case tumor-like cells, are used. Cells 
bound by the antibody are separated and retained. It will 
be understood that positive and negative separations can be 
used substantially simultaneously or in a sequential manner. 

A common technique for antibody based separation is
the use of flow cytometry such as by a fluorescence activated
cell sorter (FACS). Typically, separation by flow cytometry
is performed as follows. The suspended mixture of cells is
centrifuged and resuspended in media. Antibodies which are
conjugated to fluorochrome are added to allow the binding of
the antibodies to specific cell surface markers. The cell
mixture is then washed by one or more centrifugation and
resuspension steps. The mixture is run through a FACS which
separates the cells based on different fluorescence 
characteristics. FACS systems are available in varying 
levels of performance and ability, including multi-color 
analysis. The facilitating cell can be identified by a 
characteristic profile of forward and side scatter which is
influenced by size and granularity, as well as by positive
and/or negative expression of certain cell surface markers.

Other separation techniques besides flow cytometry 
can also provide fast separations. One such method is 
biotin-avidin based separation by affinity chromatography. 
Typically, such a technique is performed by incubating cells 
with biotin-coupled antibodies to specific markers, such as, 
for example, the transmembrane protein encoded by the tumor- 
specific marker, followed by passage through an avidin 
column. Biotin-antibody-cell complexes bind to the column 
via the biotin-avidin interaction, while other cells pass 
through the column. The specificity of the biotin-avidin 
system is well suited for rapid positive separation. 
Multiple passages can ensure separation of a sufficient 
level of the tumor cell subpopulation of interest. 

In instances whereby the goal of the separation 
technique is to deplete the overall number of cells 
belonging to the tumor cell subpopulation, the cells derived 
from the starting source of cells which has now been 
effectively depleted of tumor cells can be reintroduced into 
the patient. Such a depletion of the tumor cell 
subpopulation results in the amelioration of cancerous 
disorders associated with tumor progression. 

In instances whereby the goal of the separation 
technique is to augment or increase the overall number of 
cells belonging to a non-cancerous cell subpopulation, cells 
derived from the purified normal cell subpopulation can be 
reintroduced into the patient, thus resulting in the 
amelioration of cancerous disorders associated with an under 
activity of the normal cell subpopulation. 

The cells to be reintroduced will be cultured and 
expanded ex vivo prior to reintroduction. Purified normal
cell subpopulation cells can be washed, suspended in, for
example, buffered saline, and reintroduced into the patient
via intravenous administration.

Cells to be expanded can be cultured, using standard 
procedures, in the presence of an appropriate expansion 
agent which induces proliferation of the purified normal 
cell subpopulation. Such an expansion agent can, for 
example, be any appropriate cytokine, antigen, or antibody.

Prior to being reintroduced into a patient, the 
purified normal cells can be modified by, for example, 
transformation with gene sequences encoding gene products of 
interest. Such gene products should represent products 
which enhance the activity of the purified normal cell 
subpopulation or, alternatively, represent products which 
repress the activity of one or more of the other normal cell 
subpopulations. Cell transformation and gene expression 
procedures are well known to those of skill in the art, and 
can be as those described, above, in Section 5.2. 

Well-known targeting methods can, additionally, be 
utilized in instances wherein the goal is to deplete the 
number of cells belonging to a specific tumor cell 
subpopulation. Such targeting methods can be in vivo or in
vitro, and can involve the introduction of targeting agents 
into a population of cells such that the targeting agents 
selectively destroy a specific subset of the cells within 
the population. In vivo administration techniques which can 
be followed for such targeting agents are described, below, 
in Section 5.10.

Targeting agents generally comprise, first, a 
targeting moiety which, in the current instance, causes the 
targeting agent to selectively associate with a specific 
tumor cell subpopulation. The targeting agents generally 
comprise, second, a moiety capable of destroying a cell with 
which the targeting agent has become associated. 



Targeting moieties can include, but are not limited 
to, antibodies directed to cell surface markers found 
specifically on the tumor cell subpopulation being targeted, 
or, alternatively, to ligands, such as growth factors, which 
bind receptor-type molecules found exclusively on the 
targeted tumor cell subpopulation. 

Destructive moieties include any moiety capable of 
inactivating or destroying a cell to which the targeting 
agent has become bound. For example, a destructive moiety 
can include, but is not limited to, cytotoxins or
radioactive agents. Cytotoxins include, for example, plant-,
fungus-, or bacteria-derived toxins, with deglycosylated
Ricin A chain toxins being generally preferred due to their
potency and lengthy half-lives.

5.10. PHARMACEUTICAL PREPARATIONS
AND METHODS OF ADMINISTRATION

The identified compounds that inhibit target gene
expression, synthesis and/or activity can be administered to
a patient at therapeutically effective doses to prevent,
treat or ameliorate tumor progression. A therapeutically
effective dose refers to that amount of the compound
sufficient to result in amelioration of symptoms of tumor
progression.

5.10.1. EFFECTIVE DOSE
Toxicity and therapeutic efficacy of such compounds
can be determined by standard pharmaceutical procedures in
cell cultures or experimental animals, e.g., for determining
the LD50 (the dose lethal to 50% of the population) and the
ED50 (the dose therapeutically effective in 50% of the
population). The dose ratio between toxic and therapeutic
effects is the therapeutic index and it can be expressed as
the ratio LD50/ED50. Compounds which exhibit large
therapeutic indices are preferred. While compounds that 
exhibit toxic side effects can be used, care should be taken 
to design a delivery system that targets such compounds to 
the site of affected tissue in order to minimize potential 
damage to uninfected cells and, thereby, reduce side 
effects. 
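
The ratio defined above can be written out as a one-line calculation. The function name `therapeutic_index` and the example dose values are hypothetical and serve only to show the arithmetic; both doses must be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, defined as LD50 / ED50.

    ld50: dose lethal to 50% of the population
    ed50: dose therapeutically effective in 50% of the population
    Both values must be positive and in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical values: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
    print(therapeutic_index(200.0, 10.0))
```

A larger value indicates a wider margin between the effective dose and the lethal dose, which is why compounds with large therapeutic indices are preferred.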

The data obtained from the cell culture assays and 
animal studies can be used in formulating a range of dosage 
for use in humans. The dosage of such compounds lies 
preferably within a range of circulating concentrations that 
include the ED50 with little or no toxicity. The dosage can
vary within this range depending upon the dosage form 
employed and the route of administration utilized. For any 
compound used in the method of the invention, the 
therapeutically effective dose can be estimated initially 
from cell culture assays. A dose can be formulated in 
animal models to achieve a circulating plasma concentration 
range that includes the IC50 (i.e., the concentration of the
test compound which achieves a half-maximal inhibition of 
symptoms) as determined in cell culture. Such information 
can be used to more accurately determine useful doses in 
humans. Levels in plasma can be measured, for example, by 
high performance liquid chromatography. 
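
One simple way to estimate an IC50 from cell culture data is to interpolate between the two tested concentrations that bracket half-maximal inhibition. The sketch below is an assumption made for illustration, not a procedure taken from the specification: it interpolates linearly on a log10 concentration scale, and the function name `estimate_ic50` and the input layout are hypothetical.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the IC50 by log-linear interpolation.

    concentrations: positive test concentrations in ascending order
    inhibitions: matching responses as fractions of maximal inhibition
                 (0.0 to 1.0), assumed to rise with concentration
    Returns the interpolated concentration giving 0.5 inhibition, or
    None if the data never bracket half-maximal inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 0.5 <= y_hi and y_hi > y_lo:
            # Fraction of the way from the lower to the upper point.
            frac = (0.5 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical dose-response data (concentration units are arbitrary).
    print(estimate_ic50([1, 100], [0.0, 1.0]))
```

In practice a fitted dose-response curve is usually preferred over two-point interpolation; the sketch only shows what "half-maximal inhibition" means numerically.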

5.10.2. FORMULATIONS AND USE
Pharmaceutical compositions for use in accordance 
with the present invention can be formulated in conventional 
manner using one or more physiologically acceptable carriers 
or excipients. 

Thus, the compounds and their physiologically 
acceptable salts and solvates can be formulated for 
administration by inhalation or insufflation (either through
the mouth or the nose) or oral, buccal, parenteral or rectal
administration.

For oral administration, the pharmaceutical
compositions can take the form of, for example, tablets or
capsules prepared by conventional means with
pharmaceutically acceptable excipients such as binding
agents (e.g., pregelatinised maize starch,
polyvinylpyrrolidone or hydroxypropyl methylcellulose);
fillers (e.g., lactose, microcrystalline cellulose or
calcium hydrogen phosphate); lubricants (e.g., magnesium
stearate, talc or silica); disintegrants (e.g., potato
starch or sodium starch glycolate); or wetting agents (e.g.,
sodium lauryl sulphate). The tablets can be coated by
methods well known in the art. Liquid preparations for oral
administration can take the form of, for example, solutions,
syrups or suspensions, or they can be presented as a dry
product for constitution with water or other suitable
vehicle before use. Such liquid preparations can be
prepared by conventional means with pharmaceutically
acceptable additives such as suspending agents (e.g.,
sorbitol syrup, cellulose derivatives or hydrogenated edible
fats); emulsifying agents (e.g., lecithin or acacia);
non-aqueous vehicles (e.g., almond oil, oily esters, ethyl
alcohol or fractionated vegetable oils); and preservatives
(e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid).
The preparations can also contain buffer salts, flavoring,
coloring and sweetening agents as appropriate.

Preparations for oral administration can be suitably
formulated to give controlled release of the active
compound.

For buccal administration the compositions can take
the form of tablets or lozenges formulated in conventional
manner.

For administration by inhalation, the compounds for
use according to the present invention are conveniently
delivered in the form of an aerosol spray presentation from
pressurized packs or a nebulizer, with the use of a suitable
propellant, e.g., dichlorodifluoromethane,
trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized
aerosol the dosage unit can be determined by providing a
valve to deliver a metered amount. Capsules and cartridges
of, e.g., gelatin for use in an inhaler or insufflator can be
formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch.

The compounds can be formulated for parenteral 
administration by injection, e.g., by bolus injection or
continuous infusion. Formulations for injection can be
presented in unit dosage form, e.g., in ampoules or in
multi-dose containers, with an added preservative. The
compositions can take such forms as suspensions, solutions
or emulsions in oily or aqueous vehicles, and can contain
formulatory agents such as suspending, stabilizing and/or
dispersing agents. Alternatively, the active ingredient can
be in powder form for constitution with a suitable vehicle,
e.g., sterile pyrogen-free water, before use.

The compounds can also be formulated in rectal 
compositions such as suppositories or retention enemas, 
e.g., containing conventional suppository bases such as
cocoa butter or other glycerides. 

In addition to the formulations described 
previously, the compounds can also be formulated as a depot 
preparation. Such long acting formulations can be 
administered by implantation (for example, subcutaneously or
intramuscularly) or by intramuscular injection. Thus, for
example, the compounds can be formulated with suitable
polymeric or hydrophobic materials (for example as an
emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly 
soluble salt. 

The compositions can, if desired, be presented in a 
pack or dispenser device which can contain one or more unit 
dosage forms containing the active ingredient. The pack can 
for example comprise metal or plastic foil, such as a 
blister pack. The pack or dispenser device can be 
accompanied by instructions for administration. 

5.11. DIAGNOSIS OF TUMOR PROGRESSION 

A variety of methods can be employed for the 
diagnosis of tumor progression and of disorders involving 
tumor progression, including metastatic diseases. Such 
methods can, for example, utilize reagents such as 
fingerprint gene nucleotide sequences described in Sections 
5.2.1, and antibodies directed against differentially 
expressed and pathway gene peptides, as described, above, in 
Section 5.2.1 (peptides) and 5.2.3 (antibodies). 
Specifically, such reagents can be used, for example, for 
the detection of the presence of target gene mutations, or 
the detection of either over- or under-expression of target
gene mRNA.

The methods described herein can be performed, for 
example, by utilizing pre-packaged diagnostic kits 
comprising at least one specific fingerprint gene nucleic 
acid or anti-fingerprint gene antibody reagent described 
herein, which can be conveniently used, e.g., in clinical
settings, to diagnose patients exhibiting symptoms of
metastatic diseases.

Any cell type or tissue, preferably T-cells, in
which the fingerprint gene is expressed can be utilized in
the diagnostics described below.



5.11.1. DETECTION OF FINGERPRINT
GENE NUCLEIC ACIDS

DNA or RNA from the cell type or tissue to be
analyzed can easily be isolated using procedures which are
well known to those in the art. Diagnostic procedures can
also be performed "in situ" directly upon tissue sections
(fixed and/or frozen) of patient tissue obtained from
biopsies or resections, such that no nucleic acid
purification is necessary. Nucleic acid reagents such as
those described in Section 5.1 can be used as probes and/or
primers for such in situ procedures (see, for example,
Nuovo, G.J., 1992, PCR in situ hybridization: Protocols and
Applications, Raven Press, NY).

Fingerprint gene nucleotide sequences, either RNA or
DNA, can, for example, be used in hybridization or
amplification assays of biological samples to detect gene
structures and expression associated with metastasis. Such
assays can include, but are not limited to, Southern or
Northern analyses, single stranded conformational
polymorphism analyses, in situ hybridization assays, and
polymerase chain reaction analyses. Such analyses can
reveal both quantitative aspects of the expression pattern
of the fingerprint gene, and qualitative aspects of the
fingerprint gene expression and/or gene composition. That
is, such aspects can include, for example, point
mutations, insertions, deletions, chromosomal
rearrangements, and/or activation or inactivation of gene
expression.

Preferred diagnostic methods for the detection of
fingerprint gene-specific nucleic acid molecules can involve,
for example, contacting and incubating nucleic acids,
derived from the cell type or tissue being analyzed, with
one or more labeled nucleic acid reagents as are described
in Section 5.1, under conditions favorable for the specific
annealing of these reagents to their complementary sequences
within the nucleic acid molecule of interest. Preferably,
the lengths of these nucleic acid reagents are at least 15
to 30 nucleotides. After incubation, all non-annealed
nucleic acids are removed from the nucleic acid:fingerprint
RNA molecule hybrid. The presence of nucleic acids from the
target tissue which have hybridized, if any such molecules
exist, is then detected. Using such a detection scheme, the
nucleic acid from the tissue or cell type of interest can be
immobilized, for example, to a solid support such as a
membrane, or a plastic surface such as that on a microtitre
plate or polystyrene beads. In this case, after incubation,
non-annealed, labeled fingerprint nucleic acid reagents of
the type described in Section 5.1 are easily removed.
Detection of the remaining, annealed, labeled nucleic acid
reagents is accomplished using standard techniques well
known to those in the art.
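
The annealing step can be pictured computationally as a search for the reverse complement of a labeled probe within the sample nucleic acid. The sketch below is a simplification under stated assumptions: it requires perfect complementarity, whereas actual hybridization depends on stringency and can tolerate mismatches, and the function names `reverse_complement` and `annealing_sites` are hypothetical. The 15 to 30 nucleotide length check follows the preferred probe length given above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq):
    """Upper-case a sequence and write RNA bases in the DNA alphabet."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Return the reverse complement of a DNA or RNA sequence (as DNA)."""
    return normalize(seq).translate(COMPLEMENT)[::-1]


def annealing_sites(probe, target):
    """Return 0-based positions in target where probe would anneal.

    Only perfectly complementary sites are reported (an assumption).
    """
    if not 15 <= len(probe) <= 30:
        raise ValueError("probe length should be 15 to 30 nucleotides")
    site = reverse_complement(probe)
    target = normalize(target)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    # Hypothetical sequences chosen only to demonstrate the search.
    region = "ACGTTGCAAGGCTAAC"
    probe = reverse_complement(region)
    print(annealing_sites(probe, "TTTT" + region + "GGGG"))
```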

Alternative diagnostic methods for the detection of
fingerprint gene specific nucleic acid molecules can involve
their amplification, e.g., by PCR (the experimental
embodiment set forth in Mullis, K.B., 1987, U.S. Patent No.
4,683,202), ligase chain reaction (Barany, F., 1991, Proc.
Natl. Acad. Sci. USA 88:189-193), self sustained sequence
replication (Guatelli, J.C. et al., 1990, Proc. Natl. Acad.
Sci. USA 87:1874-1878), transcriptional amplification system
(Kwoh, D.Y. et al., 1989, Proc. Natl. Acad. Sci. USA
86:1173-1177), Q-Beta Replicase (Lizardi, P.M. et al., 1988,
Bio/Technology 6:1197), or any other nucleic acid
amplification method, followed by the detection of the
amplified molecules using techniques well known to those of
skill in the art. These detection schemes are especially
useful for the detection of nucleic acid molecules if such
molecules are present in very low numbers.

In one embodiment of such a detection scheme, a cDNA
molecule is obtained from an RNA molecule of interest (e.g.,
by reverse transcription of the RNA molecule into cDNA).
Cell types or tissues from which such RNA can be isolated
include any tissue in which wild type fingerprint gene is
known to be expressed. A sequence within the cDNA is then
used as the template for a nucleic acid amplification
reaction, such as a PCR amplification reaction, or the like.
The nucleic acid reagents used as synthesis initiation
reagents (e.g., primers) in the reverse transcription and
nucleic acid amplification steps of this method are chosen
from among the fingerprint gene nucleic acid reagents
described in Section 5.1. The preferred lengths of such
nucleic acid reagents are at least 19-30 nucleotides. For
detection of the amplified product, the nucleic acid
amplification can be performed using radioactively or
non-radioactively labeled nucleotides. Alternatively, enough
amplified product can be made such that the product can be
visualized by standard ethidium bromide staining or by
utilizing any other suitable nucleic acid staining method.

In addition to methods which focus primarily on the
detection of one nucleic acid sequence, fingerprint
profiles, as discussed in Section 5.3.4, can also be
assessed in such detection schemes. Fingerprint profiles
can be generated, for example, by utilizing a differential
display procedure, as discussed above in Section 5.1.1.2,
Northern analysis and/or RT-PCR. Any of the gene sequences
described, above, in Section 5.2.1 can be used as probes
and/or PCR primers for the generation and corroboration of
such fingerprint profiles.



5.11.2. DETECTION OF TARGET GENE PEPTIDES

Antibodies directed against wild type or mutant
fingerprint gene peptides, which are discussed, above, in
Section 5.2.3, can also be used in tumor progression
diagnostics and prognostics, as described, for example,
herein. Such diagnostic methods can be used to detect
abnormalities in the level of fingerprint gene protein
expression, or abnormalities in the structure and/or tissue,
cellular, or subcellular location of fingerprint gene
protein. Structural differences can include, for example,
differences in the size, electronegativity, or antigenicity
of the mutant fingerprint gene protein relative to the
normal fingerprint gene protein.

Protein from the tissue or cell type to be analyzed
can easily be isolated using techniques which are well known
to those of skill in the art. The protein isolation methods
employed herein can, for example, be such as those described
in Harlow and Lane (Harlow, E. and Lane, D., 1988,
"Antibodies: A Laboratory Manual," Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York), which is
incorporated herein by reference in its entirety.

Preferred diagnostic methods for the detection of
wild type or mutant fingerprint gene peptide molecules can
involve, for example, immunoassays wherein fingerprint gene
peptides are detected by their interaction with an
anti-fingerprint gene specific peptide antibody.

For example, antibodies, or fragments of antibodies,
such as those described, above, in Section 5.2.3, useful in
the present invention can be used to quantitatively or
qualitatively detect the presence of wild type or mutant
fingerprint gene peptides. This can be accomplished, for
example, by immunofluorescence techniques employing a
fluorescently labeled antibody (see below) coupled with
light microscopic, flow cytometric, or fluorimetric
detection. Such techniques are especially preferred if the
fingerprint gene peptides are expressed on the cell surface.

The antibodies (or fragments thereof) useful in the
present invention can, additionally, be employed
histologically, as in immunofluorescence or immunoelectron
microscopy, for in situ detection of target gene peptides.
In situ detection can be accomplished by removing a
histological specimen from a patient, and applying thereto a
labeled antibody of the present invention. The antibody (or
fragment) is preferably applied by overlaying the labeled
antibody (or fragment) onto a biological sample. Through
the use of such a procedure, it is possible to determine not
only the presence of the fingerprint gene peptides, but also
their distribution in the examined tissue. Using the
present invention, those of ordinary skill will readily
perceive that any of a wide variety of histological methods
(such as staining procedures) can be modified in order to
achieve such in situ detection.

Immunoassays for wild type or mutant fingerprint
gene peptides typically comprise incubating a biological
sample, such as a biological fluid, a tissue extract,
freshly harvested cells, or cells which have been incubated
in tissue culture, in the presence of a detectably labeled
antibody capable of identifying fingerprint gene peptides,
and detecting the bound antibody by any of a number of
techniques well-known in the art.

The biological sample can be brought in contact with
and immobilized onto a solid phase support or carrier such
as nitrocellulose, or other solid support which is capable
of immobilizing cells, cell particles or soluble proteins.
The support can then be washed with suitable buffers
followed by treatment with the detectably labeled
fingerprint gene specific antibody. The solid phase support
can then be washed with the buffer a second time to remove
unbound antibody. The amount of bound label on the solid
support can then be detected by conventional means.

By "solid phase support or carrier" is intended any 

support capable of binding an antigen or an antibody.
Well-known supports or carriers include glass, polystyrene,
polypropylene, polyethylene, dextran, nylon, amylases,
natural and modified celluloses, polyacrylamides, gabbros,
and magnetite. The nature of the carrier can be either
soluble to some extent or insoluble for the purposes of the
present invention. The support material can have virtually
any possible structural configuration so long as the coupled
molecule is capable of binding to an antigen or antibody.
Thus, the support configuration can be spherical, as in a
bead, or cylindrical, as in the inside surface of a test
tube, or the external surface of a rod. Alternatively, the
surface can be flat such as a sheet, test strip, etc.
Preferred supports include polystyrene beads. Those skilled
in the art will know many other suitable carriers for
binding antibody or antigen, or will be able to ascertain
the same by use of routine experimentation.

The binding activity of a given lot of anti-wild
type or mutant fingerprint gene peptide antibody can be
determined according to well-known methods. Those skilled
in the art will be able to determine operative and optimal
assay conditions for each determination by employing routine
experimentation.

One of the ways in which the fingerprint gene
peptide-specific antibody can be detectably labeled is by
linking the same to an enzyme and using it in an enzyme
immunoassay (EIA) (Voller, A., "The Enzyme Linked
Immunosorbent Assay (ELISA)," Diagnostic Horizons 2:1-7,
1978, Microbiological Associates Quarterly Publication,
Walkersville, MD; Voller, A. et al., J. Clin. Pathol.
31:507-520 (1978); Butler, J.E., Meth. Enzymol. 73:482-523
(1981); Maggio, E. (ed.), ENZYME IMMUNOASSAY, CRC Press,
Boca Raton, FL, 1980; Ishikawa, E. et al. (eds.), ENZYME
IMMUNOASSAY, Igaku Shoin, Tokyo, 1981). The enzyme which is
bound to the antibody will react with an appropriate
substrate, preferably a chromogenic substrate, in such a
manner as to produce a chemical moiety which can be
detected, for example, by spectrophotometric, fluorimetric
or visual means. Enzymes which can be used to detectably
label the antibody include, but are not limited to, malate
dehydrogenase, staphylococcal nuclease, delta-5-steroid
isomerase, yeast alcohol dehydrogenase,
alpha-glycerophosphate dehydrogenase, triose phosphate
isomerase, horseradish peroxidase, alkaline phosphatase,
asparaginase, glucose oxidase, beta-galactosidase,
ribonuclease, urease, catalase, glucose-6-phosphate
dehydrogenase, glucoamylase and acetylcholinesterase. The
detection can be accomplished by colorimetric methods which
employ a chromogenic substrate for the enzyme. Detection can
also be accomplished by visual comparison of the extent of
enzymatic reaction of a substrate in comparison with
similarly prepared standards.

Detection can also be accomplished using any of a
variety of other immunoassays. For example, by
radioactively labeling the antibodies or antibody fragments,
it is possible to detect fingerprint gene wild type or
mutant peptides through the use of a radioimmunoassay (RIA)
(see, for example, Weintraub, B., Principles of
Radioimmunoassays, Seventh Training Course on Radioligand
Assay Techniques, The Endocrine Society, March, 1986, which
is incorporated by reference herein). The radioactive
isotope can be detected by such means as the use of a gamma
counter or a scintillation counter or by autoradiography.

It is also possible to label the antibody with a
fluorescent compound. When the fluorescently labeled
antibody is exposed to light of the proper wave length, its
presence can then be detected due to fluorescence. Among
the most commonly used fluorescent labeling compounds are
fluorescein isothiocyanate, rhodamine, phycoerythrin,
phycocyanin, allophycocyanin, o-phthaldehyde and
fluorescamine.

The antibody can also be detectably labeled using
fluorescence emitting metals such as 152Eu, or others of the
lanthanide series. These metals can be attached to the
antibody using such metal chelating groups as
diethylenetriaminepentaacetic acid (DTPA) or
ethylenediaminetetraacetic acid (EDTA).

The antibody also can be detectably labeled by
coupling it to a chemiluminescent compound. The presence of
the chemiluminescent-tagged antibody is then determined by
detecting the presence of luminescence that arises during
the course of a chemical reaction. Examples of particularly
useful chemiluminescent labeling compounds are luminol,
isoluminol, theromatic acridinium ester, imidazole,
acridinium salt and oxalate ester.

Likewise, a bioluminescent compound can be used to
label the antibody of the present invention.
Bioluminescence is a type of chemiluminescence found in
biological systems, in which a catalytic protein increases
the efficiency of the chemiluminescent reaction. The
presence of a bioluminescent protein is determined by
detecting the presence of luminescence. Important
bioluminescent compounds for purposes of labeling are
luciferin, luciferase and aequorin.

6. EXAMPLE: IDENTIFICATION AND
CHARACTERIZATION OF A NOVEL GENE
THAT INHIBITS TUMOR PROGRESSION

In the Example presented in this Section, the in
vitro paradigm, described, above, in Section 5.1.1.1, was
utilized to identify a gene, designated herein as the 030
gene, which is differentially expressed in cells with a high
metastatic potential relative to cells having a low
metastatic potential. Specifically, the 030 gene is
expressed in high metastatic potential cells at a rate which
is many-fold lower than it is expressed in non-metastatic
cells. Thus, as discussed below, the 030 gene can encode a
product important to a number of neoplastic processes,
including, for example, the progression of a cell to a
metastatic state, the aggressiveness of a cell's metastatic
state, and the ability of a primary tumor cell to invade
surrounding tissue. Given the differential 030 gene
expression pattern revealed in this Section, the 030 gene
product can represent a protein having tumor suppressor or
inhibitor function.

6.1. MATERIALS AND METHODS
6.1.1. CELL CULTURE
B16 F1 and B16 F10 melanoma cell lines were
maintained in culture in Eagle's minimal essential medium
(MEM) supplemented with 10% fetal calf serum. Cells were
harvested from nonconfluent monolayers by a two minute
treatment with 0.25% trypsin and 2 mM EDTA.



For further characterization of in vivo activity,
each cell line was injected into mice. Cells were washed
two times in MEM, and the final cell suspension adjusted to
5 x 10^5 cells per ml in MEM. Two hundred microliters of
this cell suspension (1 x 10^5 cells) was injected i.v. into
the lateral tail vein of C57BL/6J mice. After three weeks,
the mice were sacrificed and their lungs autopsied. The
number of pulmonary tumors was determined by counting
surface nodules using a dissecting microscope.
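
The injected dose follows directly from the suspension density and the injection volume. The short Python sketch below reproduces that arithmetic; the function name and the microliter-to-milliliter conversion are illustrative choices and are not part of the protocol itself.

```python
def cells_injected(cells_per_ml: float, volume_ul: float) -> float:
    """Number of cells contained in volume_ul of a suspension.

    cells_per_ml -- suspension density (cells per milliliter)
    volume_ul    -- injected volume in microliters
    """
    return cells_per_ml * (volume_ul / 1000.0)  # 1000 microliters per ml


# Values stated in the text: 5 x 10^5 cells/ml, 200 microliters injected.
dose = cells_injected(5e5, 200)
print(f"cells per mouse: {dose:.0f}")
```

Under these inputs the result agrees with the 1 x 10^5 cells per mouse stated in the text.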
The differential expression of the 030 gene in B16
F1 relative to B16 F10 cell lines was compared with the
extent of pulmonary metastases which developed in B16
F1-injected mice relative to B16 F10-injected mice.

6.1.2. DIFFERENTIAL DISPLAY

Differential mRNA display was carried out as
described, above, in Section 5.1.1.2. Details of the
differential display are given, below.
RNA isolation

RNA was isolated, using RNAzol, from nonconfluent
monolayers of B16 F1 and B16 F10 cell lines.

Isolated RNA was resuspended in DEPC H2O and
quantitated by spectrophotometry at OD260. Approximately
half of the RNA samples were then treated with DNAse I to
remove contaminating chromosomal DNA. Each 50 µl RNA sample
(50 ng), 5.7 µl 10x PCR buffer (Perkin-Elmer/Cetus) and 1 µl
RNAse inhibitor (40 units/µl; Boehringer Mannheim, Germany)
were mixed together. Two microliters of DNAse I (10
units/µl; Boehringer Mannheim) was added to the reaction
which was incubated for 30 min. at 37°C. The total volume
was brought to 200 µl with DEPC H2O, extracted once with
phenol/chloroform and precipitated by adding 20 µl 3M NaOAc,
pH 4.8 (DEPC-treated), 500 µl absolute EtOH and incubating
for 1 hour on dry ice. The precipitated sample was
centrifuged for 15 min., and the pellet was washed with 70%
EtOH. The sample was re-centrifuged, the remaining liquid
was aspirated, and the pellet was resuspended in 50 µl H2O.
The concentration of RNA was measured by reading the OD260.
First strand cDNA synthesis 

For each RNA sample, duplicate reverse transcription
reactions were carried out in parallel. Four hundred ng RNA
plus DEPC H2O in a total volume of 10 µl were added to 4 µl
T11CC 3' primer (10 µM; Operon). The mixture was incubated
at 70°C for 5 min. to denature the RNA and then placed at
room temperature. Twenty-six µl of reaction mix containing
the following components was added to each denatured
RNA/primer sample: 8 µl 5x First Strand Buffer (Gibco/BRL,
Gaithersburg, MD), 4 µl 0.1M DTT (Gibco/BRL), 2 µl RNAse
inhibitor (40 units/µl) (Boehringer Mannheim), 4 µl 200 µM
dNTP mix, 6 µl H2O, 2 µl Superscript reverse transcriptase
(200 units/µl; Gibco/BRL). The reactions were mixed gently
and incubated for 30 min. at 42°C. Sixty µl of H2O, for a
final volume of 100 µl, was then added and the samples were
denatured for 5 min. at 85°C and stored at -20°C.
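
The volumes in the first-strand protocol can be cross-checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary keys are informal labels, and every volume is read as microliters, which is an assumption wherever the scanned text is unclear about the unit.

```python
# Component volumes of the reverse transcription reaction mix, in microliters.
reaction_mix = {
    "5x First Strand Buffer": 8,
    "0.1 M DTT": 4,
    "RNAse inhibitor": 2,
    "dNTP mix": 4,
    "H2O": 6,
    "Superscript reverse transcriptase": 2,
}
rna_plus_water = 10        # RNA plus DEPC H2O
primer = 4                 # 3' primer
water_added_after_rt = 60  # H2O added after the 42 C incubation

mix_total = sum(reaction_mix.values())
rt_volume = rna_plus_water + primer + mix_total
final_volume = rt_volume + water_added_after_rt

# The buffer is supplied at 5x; this is its strength in the RT reaction.
buffer_strength = 5 * reaction_mix["5x First Strand Buffer"] / rt_volume

print(mix_total, rt_volume, final_volume, buffer_strength)
```

The component volumes sum to the stated 26 µl of reaction mix, the reverse transcription reaction comes to 40 µl with the buffer at 1x, and the added water brings the sample to the stated 100 µl.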

PCR reactions

The resulting single stranded cDNA molecules were
then amplified by PCR. Specifically, 13 µl of reaction mix
was added to each tube of a 96 well plate on ice. The
reaction mix contained 6.4 µl H2O, 2 µl 10x PCR Buffer
(Perkin-Elmer), 2 µl 20 µM dNTPs, 0.4 µl 35S dATP (12.5
µCi/µl; 50 µCi total; Dupont/NEN), 2 µl 5' primer OPE4
(5'-GTGACATGCC-3'; 10 µM; Operon), and 0.2 µl AmpliTaq®
Polymerase (5 units/µl; Perkin-Elmer). Next, 2 µl of 3'
primer (T11CC, 10 µM) were added to the side of each tube,
followed by 5 µl of cDNA, also to the sides of the tubes,
which were still on ice. Tubes were capped and mixed, and
brought up to 1000 rpm in a centrifuge, then immediately
returned to ice. A Perkin-Elmer 9600 thermal cycler was
used, and programmed as follows:



     94°C         2 min.
    *94°C         15 sec.      * = x 40
    *40°C         2 min.
    *ramp 72°C    1 min.
    *72°C         30 sec.
     72°C         5 min.
     4°C          hold
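
The cycling program above can be written down as data, which makes the total programmed time easy to compute. The following Python sketch is illustrative only: the `Step` class, the step labels and the helper function are hypothetical names, the starred steps are taken to be the ones repeated 40 times, and the final 4°C hold is left out because it has no fixed duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    seconds: int
    cycled: bool = False  # True for the starred (repeated) steps


# Differential display program as listed in the text.
DISPLAY_PCR = [
    Step("94C initial", 120),
    Step("94C", 15, cycled=True),
    Step("40C", 120, cycled=True),
    Step("ramp to 72C", 60, cycled=True),
    Step("72C", 30, cycled=True),
    Step("72C final", 300),
]


def programmed_minutes(steps, cycles):
    """Total programmed time in minutes, ignoring instrument ramp overhead."""
    total = sum(s.seconds * (cycles if s.cycled else 1) for s in steps)
    return total / 60


print(programmed_minutes(DISPLAY_PCR, cycles=40))
```

With 40 cycles the programmed steps add up to 157 minutes, before any time the instrument needs to change temperature between unlisted transitions.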

When the thermal cycler initially reached 94°C, the
96 well plate was removed from ice and placed directly into
the cycler. Following the amplification reaction, 15 µl of
loading dye, containing 80% formamide, 10 mM EDTA, 1 mg/ml
xylene cyanole, 1 mg/ml bromphenol blue were added. The
loading dye and reaction were mixed, incubated at 85°C for 5
min., cooled on ice, centrifuged, and placed on ice.
Approximately 4 µl from each tube was loaded onto a pre-run
(60V) 6% denaturing acrylamide gel. The gel was run at
approximately 80V until top dye front was about 1 inch from
bottom. The gel was transferred to 3MM paper (Whatman
Paper, England) and dried under vacuum. Bands were
visualized by autoradiography.

6.1.3. OTHER TECHNIQUES
Amplified cDNA Band Isolation and Amplification

PCR bands determined to be of interest in the
differential display analysis were recovered from the gel
and reamplified.

Briefly, differentially expressed bands were excised
from the dried gel with a razor blade and placed into a
microfuge tube with 100 µl H2O and heated at 100°C for 5
min., vortexed, heated again to 100°C for 5 min., and
vortexed again. After cooling, 100 µl H2O, 20 µl 3M NaOAc,
1 µl glycogen (20 mg/ml), and 500 µl ethanol were added and
the sample was precipitated on dry ice. After
centrifugation, the pellet was washed and resuspended in 10
µl H2O.

DNA isolated from the excised differentially
expressed bands was then reamplified by PCR using the
following reaction conditions:



    58 µl     H2O
    10 µl     10x PCR Buffer (see above)
    10 µl     200 µM dNTPs
    10 µl     10 µM 3' primer (see above)
    10 µl     10 µM 5' primer (see above)
    1.5 µl    amplified band
    0.5 µl    AMPLITAQ® polymerase (5 units/µl; Perkin Elmer)
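
Final concentrations in the reamplification reaction follow from the usual dilution relation C1 x V1 = C2 x V2. The Python sketch below applies it to the listed volumes; the helper name is hypothetical, and the units (microliters, micromolar, units per microliter) are read from the listing as an assumption where the scan is ambiguous.

```python
def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution relation C2 = C1 * V1 / V2 (same concentration unit as stock)."""
    return stock * volume_ul / total_ul


# Volumes (microliters) in the order listed for the reamplification reaction.
volumes = [58, 10, 10, 10, 10, 1.5, 0.5]
total = sum(volumes)

dntp_uM = final_concentration(200, 10, total)   # 200 uM dNTP stock
primer_uM = final_concentration(10, 10, total)  # 10 uM primer stock, each primer
taq_units = 5 * 0.5                             # 5 units/ul x 0.5 ul

print(total, dntp_uM, primer_uM, taq_units)
```

On these figures the reaction volume is 100 µl, with dNTPs at 20 µM, each primer at 1 µM, and 2.5 units of polymerase per reaction.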

PCR conditions were the same as the initial
conditions used to generate the original amplified band, as
described, above. After reamplification, glycerol loading
dyes were added and samples were loaded onto a 2%
preparative TAE/Biogel (Bio101, La Jolla, CA) agarose gel
and eluted. Bands were then excised from the gel with a
razor blade and vortexed for 15 min. at r.t., and purified
using the MERMAID™ kit from Bio101 by adding 3 volumes of
MERMAID™ high salt binding solution and 8 µl of resuspended
glassfog in a microfuge tube. Glassfog was then pelleted,
washed 3 times with ethanol wash solution, and then DNA was
eluted twice in 10 µl at 50°C.
Subcloning and Sequencing

The TA cloning kit (Invitrogen, San Diego, CA) was
used to subclone the amplified bands. The ligation reaction
typically consisted of 4 µl sterile H2O, 1 µl ligation
buffer, 2 µl TA cloning vector, 2 µl PCR product, and 1 µl
T4 DNA ligase. The volume of PCR product can vary, but the
total volume of PCR product plus H2O was always 6 µl.
Ligations (including vector alone) were incubated overnight
at 12°C before bacterial transformation. TA cloning kit
competent bacteria (INVαF': endA1, recA1, hsdR17(r-k, m+k),
supE44, thi-1, gyrA, relA1, φ80lacZαΔM15Δ(lacZYA-argF),
deoR+, F') were thawed on ice and 2 µl of 0.5 M
β-mercaptoethanol were added to each tube. Two µl from each
ligation were added to each tube of competent cells (50 µl),
mixed without vortexing, and incubated on ice for 30 min.
Tubes were then placed in a 42°C bath for exactly 30 sec.,
before being returned to ice for 2 min. Four hundred-fifty
µl of SOC media (Sambrook et al., 1989, supra) were then
added to each tube, which were then shaken at 37°C for 1 hr.
Bacteria were then pelleted, resuspended in approximately
200 µl SOC and plated on Luria broth agar plates containing
X-gal and 60 µg/ml ampicillin and incubated overnight at
37°C. White colonies were then picked and screened for
inserts using PCR.

A master mix containing 2 µl 10 x PCR buffer, 1.6 µl
2.5 mM dNTP's, 0.1 µl 25 mM MgCl2, 0.2 µl M13 reverse primer
(100 ng/µl), 0.2 µl M13 forward primer (100 ng/µl), 0.1 µl
AmpliTaq® (Perkin-Elmer), and 15.8 µl H2O was made. Forty
µl of the master mix were aliquoted into tubes of a 96 well
plate, and whole bacteria were added with a pipette tip
prior to PCR. The thermal cycler was programmed for insert
screening as follows:

     94°C         2 min.
    *94°C         15 sec.      * = x 35
    *47°C         2 min.
    *ramp 72°C    30 sec.
    *72°C         30 sec.
     72°C         10 min.
     4°C          hold



Reaction products were eluted on a 2% agarose gel
and compared to vector control. Colonies with vectors
containing inserts were purified by streaking onto LB/Amp
plates. Vectors were isolated from such strains and
subjected to sequence analysis, using an Applied Biosystems
Automated Sequencer (Applied Biosystems, Inc. Seattle, WA).
Cloning of Human Gene

A human retina cDNA library obtained from Clontech 
was screened using the entire mouse fomy030 cDNA (FIGS. 3A 
and 3B) as a probe. During this screen, one million library 
phage were screened, 53 of which were found to hybridize 
with the mouse fomy030 probe. The cDNA inserts for eight of 
these positives were isolated, subcloned, and sequenced. 

Comparison of the murine fomy030 and human fohy030
sequences demonstrated a high degree of sequence similarity
(86% identical at the nucleotide level and 94.4% identical
at the amino acid level) within the 5' 1813 base pairs of
their cDNAs. However, beyond this point the sequences
diverge and share no significant similarity. The sequence
of fomy030 at the point of divergence is GTAG, which
corresponds to a consensus splice donor site.

Three independent library isolated cDNAs, as well as
a cDNA isolated as a 3' RACE product, were found to contain
the fomy030 sequence. Thus, the most probable explanation
for the divergence of the human and murine sequences is the
existence of alternate splice forms of the fomy030 and
fohy030 transcripts. The fomy030 splice version results in
a protein product of 542 amino acids in length, while the
fohy030 splice variant is predicted to encode a protein of
1497 amino acids in length (FIG. 5).

Another splice variant is shown in FIG. 6 (SEQ ID
NO:8), and encodes a protein of 1533 amino acids in length
(SEQ ID NO:9). The cDNA of FIG. 5 (SEQ ID NO:6) is missing
34 nucleotides beginning after 2879 in SEQ ID NO:8, and is
missing 74 nucleotides beginning after 2926 in SEQ ID NO:8.
Thus, nucleotides 2880-2892 in SEQ ID NO:6 are identical to
nucleotides 2914-2926 in SEQ ID NO:8, and the sequences are
essentially identical starting at 2893 in SEQ ID NO:6 and
3001 in SEQ ID NO:8. The difference in the respective amino
acid sequences is that the amino acids are identical from 1
to 844, and then again from 850 to 1497 in SEQ ID NO:7 and
from 886 to 1533 in SEQ ID NO:9.
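
The coordinate relationships stated above are internally consistent, which can be confirmed with a small position-mapping sketch in Python. The function and constant names are hypothetical, and the code uses only the positions given in the text; it does not read the sequences themselves.

```python
# Segments present in SEQ ID NO:8 but missing from SEQ ID NO:6,
# expressed in SEQ ID NO:8 coordinates as stated in the text.
GAP1_AFTER, GAP1_LEN = 2879, 34
GAP2_AFTER, GAP2_LEN = 2926, 74


def seq6_to_seq8(pos6: int) -> int:
    """Map a nucleotide position in SEQ ID NO:6 to SEQ ID NO:8."""
    pos8 = pos6
    if pos8 > GAP1_AFTER:   # past the first missing segment
        pos8 += GAP1_LEN
    if pos8 > GAP2_AFTER:   # past the second missing segment
        pos8 += GAP2_LEN
    return pos8


# Total extra nucleotides and the resulting amino acid offset.
extra_nt = GAP1_LEN + GAP2_LEN
aa_offset = extra_nt // 3

print(seq6_to_seq8(2880), seq6_to_seq8(2892), seq6_to_seq8(2893))
print(extra_nt, extra_nt % 3, 850 + aa_offset, 1497 + aa_offset)
```

The mapping reproduces the stated pairs (2880 to 2914, 2892 to 2926, 2893 to 3001). The two segments total 108 nucleotides, a multiple of three, so the downstream reading frame is preserved and the amino acid numbering shifts by 36, matching 850-1497 in SEQ ID NO:7 against 886-1533 in SEQ ID NO:9.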

Within their common 5' sequences, fohy030 was also
found to have an additional three base pairs (GGA) inserted
after position 1394 in the mouse cDNA (at positions
1066-1068 in FIGS. 5 and 6). These additional three base
pairs fall within the open reading frames of both fohy030 and
fomy030, and result in an additional Glycine residue at
position 356 within the open reading frame of fohy030
relative to fomy030.
Northern Analysis 

Northern analysis was performed to confirm the 
differential expression of the genes corresponding to the 
amplified bands, as described below. 

Twelve micrograms of total RNA sample, 1.5 x RNA
loading dyes (60% formamide, 9% formaldehyde, 1.5 x MOPS,
0.075% XC/BPB dyes) at a final concentration of 1 x and H2O
to a final volume of 40 µl were mixed. The tubes were



heated at 65°C for 5 min. and then cooled on ice. The RNA
samples analyzed were loaded onto a denaturing 1% agarose
gel. The gel was run overnight at 32V in 1 x MOPS buffer.

A 300 ml denaturing 1% agarose gel was made as
follows. Three grams of agarose (SeaKem™ LE, FMC
BioProducts, Rockland, ME) and 60 ml of 5 x MOPS buffer
(0.1M MOPS [pH 7.0], 40 mM NaOAc, 5 mM EDTA [pH 8.0]) were
added to 210 ml sterile H2O. The mixture was heated until
melted, then cooled to 50°C, at which time 5 µl ethidium
bromide (5 mg/ml) and 30 ml of 37% formaldehyde were added
to the melted gel mixture. The gel was swirled quickly to
mix, and then poured immediately.

After electrophoresis, the gel was photographed with
a fluorescent ruler, then was washed three times in DEPC
H2O, for 20 minutes per wash, at room temperature, with
shaking. The RNA was then transferred from the gel to
Hybond-N® membrane (Amersham), according to the methods of
Sambrook et al., 1989, supra, in 20 x SSC overnight.

The probes used to detect mRNA were typically
synthesized as follows: 2 µl amplified cDNA band (~30 ng),
7 µl H2O, and 2 µl 10 x Hexanucleotide mix (Boehringer-
Mannheim) were mixed and heated to 95°C for 5 min., and then
allowed to cool on ice. The volume of the amplified band
can vary, but the total volume of the band plus H2O was
always 9 µl. 3 µl dATP/dGTP/dTTP mix (1:1:1 of 0.5 mM
each), 5 µl α-32P dCTP 3000 Ci/mM (50 µCi total; Amersham,
Arlington Heights, IL), and 1 µl Klenow (2 units;
Boehringer-Mannheim) were mixed and incubated at 37°C.
After 1 hr., 30 µl TE were added and the reaction was loaded
onto a Biospin-6™ column (Biorad, Hercules, CA), and
centrifuged. A 1 µl aliquot of eluate was used to measure
incorporation in a scintillation counter with scintillant to
ensure that 10^6 cpm/µl of incorporation was achieved.



For pre-hybridization, the blot was placed into a
roller bottle containing 10 ml of rapid-hyb solution
(Amersham), and placed into a 65°C incubator for at least 1
hr. For hybridization, 1 x 10^7 cpm of the probe was then
heated to 95°C, chilled on ice, and added to 10 ml of
rapid-hyb solution. The prehybridization solution was then
replaced with probe solution and incubated for 16 hours at
65°C. The following day, the blot was washed once for 20
min. at room temperature in 2 x SSC/0.1% SDS and twice for
15 min. at 65°C in 0.1 x SSC/0.1% SDS before being covered
in plastic wrap and put down for exposure.
In Situ Hybridization 

10 µm sections of formalin fixed/paraffin embedded
benign nevi (non-metastatic growths of melanocytes) and
malignant melanoma were post-fixed with 4% PFA/PBS for 15
minutes. After washing with PBS, sections were digested
with 21 µg/ml proteinase K at 37°C for 15 minutes, and again
incubated with 4% PFA/PBS for 10 minutes. Sections were
then washed with PBS, incubated with 0.2 N HCl for 10
minutes, washed with PBS, incubated with 0.25% acetic
anhydride/1 M triethanolamine for 10 minutes, washed with
PBS, and dehydrated with 70% ethanol and 100% ethanol.

Hybridizations were performed with 35S-radiolabeled
(5 x 10^7 cpm/ml) cRNA probes encoding a 1.1 kb segment of
the coding region of the human cDNA (clone fohy030), and a 1
kb segment of the coding region of the human H4 histone gene
in the presence of 50% formamide, 10% dextran sulfate, 1 x
Denhardt's solution, 600 mM NaCl, 10 mM DTT, 0.25% SDS, and
100 µg/ml tRNA for 18 hours at 55°C. The H4 histone gene
was used as a control to show proper transcription of RNA.

After hybridization, slides were washed with 5 x SSC
at 55°C, 50% formamide/2 x SSC at 55°C for 30 minutes, 10 mM
Tris-HCl (pH 7.6)/500 mM NaCl/1 mM EDTA (TNE) at 37°C for 10
minutes, washed in TNE at 37°C for 10 minutes, incubated
once in 2 x SSC at 50°C for 30 minutes, twice in 0.2 x SSC
at 50°C for 30 minutes, and dehydrated with 70% ethanol and
100% ethanol. Localization of mRNA transcripts was detected
by dipping slides in Kodak NTB-2 photo-emulsion and exposing
for 4 days at 4°C. Controls for the in situ hybridization
experiments included the use of a sense probe, which showed
no signal above background levels.

6.2. RESULTS
An in vitro paradigm, as described, above, in
Section 5.1.1.1, was carried out using the melanoma cell
lines, B16 F1 and B16 F10. The B16 F1 cell line exhibits a
low metastatic potential, while the B16 F10 cell line
exhibits a high metastatic potential. Thus, the two cell
lines were grown in vitro as described in Section 6.1.1, RNA
was isolated from these cells and differential display
carried out as described in Section 6.1.

The differential display analysis identified a band,
designated romy030, which represents a cDNA derived from RNA
produced by a gene which was expressed at a much higher
level in the B16 F1 cells, i.e., the low metastatic
potential cells, relative to the gene's expression in B16
F10 cells, i.e., high metastatic potential cells. The gene
corresponding to the romy030 band is referred to herein as
the fomy030 or 030 gene.

The amplified romy030 band was isolated,
reamplified, subcloned, and sequenced, as described, above,
in Section 6.1.3. The romy030 nucleotide sequence (SEQ ID
NO:1) is shown in FIG. 2.

A BLAST (Altschul, S.F. et al., 1990, J. Mol. Biol.
215:403-410) database search with the romy030 nucleotide
sequence revealed no sequences within the database which are
similar to that of romy030. Thus, 030, the gene
corresponding to romy030, appears to represent a novel,
previously unknown gene which is differentially expressed in
cells exhibiting a low metastatic potential relative to
those cells exhibiting a high metastatic potential.

To confirm this putative differential regulation,
amplified romy030 cDNA was used to probe Northern RNA blots
containing RNA from B16 F1 and B16 F10 cells. FIG. 1 shows
the results of one such Northern blot analysis, in which it
is demonstrated that the steady state levels of fomy030 mRNA
are significantly higher in the low metastatic potential
cells (i.e., the B16 F1 cells) relative to the high
metastatic potential cells (i.e., B16 F10 cells). Lanes 1
and 3 represent F1 cells and Lanes 2 and 4 represent F10
cells, respectively. Thus, this Northern analysis confirmed
the putative differential fomy030 regulation which had been
suggested by the differential display results.

Two specific oligonucleotides were generated based
on the sequence of romy030, romy030U 5'-
GGGGAAGCACATCAAGGAAC-3' (SEQ ID NO:4) and romy030L 5'-
GCAACTACACTCGGAAAAGC-3' (SEQ ID NO:5), for use in PCR
reactions. cDNA libraries prepared from mRNA isolated from
normal melanocytes and a mouse melanoma cell line were
analyzed for the presence of fomy030 by PCR, utilizing the
above romy030 oligonucleotides. Fomy030 was detected in the
melanocyte library but not in the melanoma library. The
melanoma library was generated from a highly metastatic
mouse melanoma K-1735 m2. This result is consistent with the
observation that fomy030 is present at reduced levels in the
metastatic B16 F10 melanoma cell line. A radioactive DNA
probe was generated from the subcloned romy030 DNA. This
probe was used to screen the normal mouse melanocyte cDNA
library. Three independent positive clones were identified
and isolated during this screening. These clones were
designated fomy030a, fomy030b, and fomy030c. These cDNAs
were sequenced and the overlapping portions were found to be
identical. The nucleotide sequence of all three fomy030
cDNAs, designated as the fomy030 sequence (SEQ ID NO:2), is
depicted in FIGS. 3A and 3B, and contains the sequence of
romy030. The findings described herein suggest a novel role
for fomy030 in tumor progression. A down-regulation of 030
can be used as a diagnostic marker for tumor progression,
especially for the progression to metastasis. Further, 030
gene products can be used in the prevention and treatment of
tumor progression disorders.
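
The relationship between the two oligonucleotides and the romy030 band can be checked with a short script. The following is a minimal sketch, not part of the disclosed method: it uses the 186-base romy030 sequence of SEQ ID NO:1 and the two 20-nucleotide oligonucleotide sequences exactly as printed in the text above, and the helper names (`reverse_complement`, `locate`, `amplicon`) are hypothetical.

```python
# Sketch: locate the romy030U / romy030L oligonucleotides on SEQ ID NO:1.
# Helper names are illustrative assumptions, not taken from the specification.

ROMY030 = (  # SEQ ID NO:1 as printed in the sequence listing (186 bases)
    "GGTGCTGGAGTACCTCATGGGCGGTGCCTACCGCTGCAACTACACTCGGAAAAGCTTCCG"
    "GACTCTCTACAACAACTTGTTTGGCCCTAAGAGGGTAGAGCTCAGCAGACACACAGTGTC"
    "CTGTGCCTCCCAGAGTAACATGTGGTTCCTTGATGTGCTTCCCCAAAAGCCCACCTGTGC"
    "AGAATG"
)
ROMY030U = "GGGGAAGCACATCAAGGAAC"  # as printed in the text
ROMY030L = "GCAACTACACTCGGAAAAGC"  # as printed in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate(primer, template):
    """Find a primer on either strand of the template.

    Returns (start, end, strand) with 0-based, end-exclusive coordinates
    on the template as written, or None if the primer does not match.
    """
    pos = template.find(primer)
    if pos != -1:
        return pos, pos + len(primer), "+"
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return pos, pos + len(primer), "-"
    return None


def amplicon(primer_a, primer_b, template):
    """Return the template region spanned by two opposing primers."""
    hit_a, hit_b = locate(primer_a, template), locate(primer_b, template)
    if hit_a is None or hit_b is None:
        raise ValueError("a primer does not match the template")
    if hit_a[2] == hit_b[2]:
        raise ValueError("primers anneal to the same strand")
    start = min(hit_a[0], hit_b[0])
    end = max(hit_a[1], hit_b[1])
    return template[start:end]


if __name__ == "__main__":
    print(locate(ROMY030U, ROMY030))
    print(locate(ROMY030L, ROMY030))
    print(len(amplicon(ROMY030U, ROMY030L, ROMY030)))
```

Searching both strands is a deliberate choice here, since the orientation in which a differential display band is written in a listing need not match the naming of the upper and lower oligonucleotides.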
Fohy030 Expression in Human Tissue Samples

To determine whether the fohy030 gene product is 
differentially expressed in clinically relevant human 
disease, fohy030 gene expression was analyzed in biopsy 
sections of human benign nevi (non-metastatic growths of
melanocytes) and malignant melanoma using in situ 
hybridization. Fohy030 expression was detected in small 
intermittent cells in the basal layer of the epidermis 
(likely, melanocytes) and in the majority of nevus cells in 
patients diagnosed with benign nevi. No expression of 
fohy030 was detected in the majority of melanoma cells in 
patients diagnosed with metastatic melanoma, though 
expression was detected in normal melanocytic cells in the 
same tissue section. These results show that the fohy030 
gene product is associated with metastasis suppression. 

6.3. 030 GENE EXPRESSION IS INVERSELY
CORRELATED WITH METASTATIC POTENTIAL

6.3.1. EXPERIMENTAL PROTOCOLS AND RESULTS

The relationship between 030 gene expression and
tumor progression was confirmed as described herein.
Specifically, the metastatic potentials of six variants of
the B16 cell line were tested in animals and the metastatic
potential was compared to the level of 030 gene expression
observed within the cell variants.

A single cell suspension of B16 F1 cells (low
metastatic potential) was injected intravenously into
syngeneic C57BL/6 mice. After three weeks, lung tumors were
excised and seeded into tissue culture. The following six
cell lines were grown in culture: B16 G1, B16 G2, B16 G3,
B16 G4, B16 G9 and B16 G12.

To test the metastatic ability of the above listed
six tumor cell lines, the same number of cells for each of
the six cell lines was injected intravenously into different
groups of syngeneic C57BL/6 mice. Three weeks later, the mice
were killed and the lungs were removed aseptically.
Significantly more tumors were observed in mice
injected with the following three cell lines: B16 G4, B16
G9 and B16 G12. These results demonstrate that the B16 G4,
B16 G9 and B16 G12 cell lines have high metastatic potential
and the B16 G1, B16 G2 and B16 G3 cell lines have low
metastatic potential.

The lung tumors produced from these three highly
metastatic cell lines (B16 G4, B16 G9 and B16 G12) were then
excised and seeded into tissue culture to produce the
following four cell lines: B16 H5, B16 H6, B16 H7 and B16
H8.

Northern analysis was performed to determine the
expression of the 030 gene in the above listed cell lines (i.e.,
B16 H5, B16 H6, B16 H7 and B16 H8) using procedures
described above in Section 6.1.3. Figure 4 shows the
results of one such Northern blot analysis, in which it is
demonstrated that the steady state levels of 030 mRNA are
significantly lower in the highly metastatic cells (i.e.,
B16 H5, B16 H6, B16 H7 and B16 H8) relative to the B16 F1
low metastatic potential cells. Lane 1 represents the B16
F1 cells, lane 2 is B16 F10 metastatic cells and lanes 3-6
represent B16 H5, B16 H6, B16 H7 and B16 H8, respectively.

Thus, this Northern analysis confirmed the initial 
finding in this invention that 030 expression is inversely 
related to the metastatic potential of tumor cells and 
supports the theory that the 030 gene product plays a role 
in inhibiting tumor progression, including the progression 
to a high metastatic potential state. In this regard, it is 
important to note that the tumor cell number and 
homogeneity, and the syngeneic recipient did not change from 
one cell line to another in the above protocols. Therefore, 
the differences in metastatic incidence can only be 
attributed to properties intrinsic to the various cell lines 
used. The clonal selection of tumors from successive 
metastases results in cells better capable of survival, 
formation and progression of tumor foci in the lung. This 
indicates that the decrease in expression of 030 observed in
the four highly metastatic cell lines (i.e., B16 H5, B16 H6,
B16 H7 and B16 H8) is an intrinsic property of these cell
lines and is related to the development, progression and 
metastatic potential of the tumor cells. 

7. EXAMPLE: USE OF FINGERPRINT GENES
AS SURROGATE MARKERS IN CLINICAL TRIALS

The expression pattern of the fingerprint genes of
the invention may be utilized as surrogate markers to
monitor clinical human trials of drugs being tested for
their efficacy as tumor progression treatments, or may,
additionally, be used to monitor patients undergoing
clinical evaluation for the treatment of tumor progression.
"Fingerprint gene," as used herein, is defined as in Section
3, above. Individual fingerprint gene expression patterns
may be analyzed or, alternatively, fingerprint patterns may
be analyzed. "Fingerprint pattern," as used herein, is
defined as in Section 3, above.

The effect of a compound on the fingerprint gene
expression normally displayed in connection with a disorder
involving tumor progression can be used to evaluate the
efficacy of the compound as a treatment for such a disorder.
Additionally, fingerprint gene expression can be used to 
monitor patients undergoing clinical evaluation for the 
treatment of the disorder. 

According to the invention, the fingerprint gene 
expression and fingerprint pattern derived from any of the 
paradigms described in Section 5.1.1.1 can be used to 
monitor clinical trials of drugs in human patients. The 
paradigms described in Section 5.1.1.1, and illustrated in 
the Example presented in Section 6, above, for example, 
provide the fingerprint pattern of B16 melanoma cells. This 
profile gives an indicative reading, therefor, of the 
metastatic and non-metastatic states of melanoma cells. 
Accordingly, the influence of anticancer chemotherapeutic 
agents on the melanoma cells can be measured by performing 
differential display on melanoma cells of patients 
undergoing clinical tests. 

7.1. TREATMENT OF PATIENTS AND PROCUREMENT
OF TUMOR CELLS OR BIOPSIES

Test patients can be administered compounds 
suspected of antimetastatic activity. Control patients can 
be given a placebo. 

Tumor cells or biopsies can be drawn from each
patient after a determined period of treatment and RNA can
be isolated as described in Section 6.6.1, above.

7.2. ANALYSIS OF SAMPLES

RNA can be subjected to differential display 
analysis as described in Section 6.6.1, above. A decrease 
in the metastatic potential of tumor cells is indicated by 
an increase in the intensity of the romy030 band, as 
described in Section 6.2, above. 
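
The comparison described in this section can be sketched as a small calculation. The following is a hedged illustration only: it assumes that the romy030 band and a loading-control band have already been quantified (for example by densitometry), and the function names, the example intensities and the 1.5-fold threshold are all hypothetical values chosen for illustration, not values taken from this specification.

```python
# Sketch: compare romy030 band intensity before and after treatment.
# All names, numbers and the threshold below are illustrative assumptions.


def normalized_intensity(band, loading_control):
    """Express a band intensity relative to a loading-control band."""
    if loading_control <= 0:
        raise ValueError("loading control intensity must be positive")
    return band / loading_control


def fold_change(before, after):
    """Ratio of post-treatment to pre-treatment normalized intensity."""
    if before <= 0:
        raise ValueError("pre-treatment intensity must be positive")
    return after / before


def interpret(fold, threshold=1.5):
    """Classify a fold change as increased, decreased or unchanged.

    An increase in the romy030 band is the direction the text associates
    with a decrease in metastatic potential.
    """
    if fold >= threshold:
        return "increased"
    if fold <= 1.0 / threshold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units).
    pre = normalized_intensity(band=120.0, loading_control=400.0)
    post = normalized_intensity(band=300.0, loading_control=400.0)
    change = fold_change(pre, post)
    print(round(change, 2), interpret(change))
```

Normalizing to a loading control before taking the ratio keeps differences in the amount of RNA loaded per lane from being read as changes in expression.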

8. DEPOSIT OF MICROORGANISMS

The following microorganism was deposited with the 
Agricultural Research Service Culture Collection (NRRL) , 
Peoria, Illinois, on March 3, 1995 and assigned the 
indicated accession number: 

Microorganism        NRRL Accession No.

E. coli              B-21416

OTHER EMBODIMENTS 

The present invention is not to be limited in scope
by the specific embodiments described, which are intended as
single illustrations of individual aspects of the invention,
and functionally equivalent methods and components are
within the scope of the invention. Indeed, various
modifications of the invention, in addition to those
shown and described herein, will become apparent to those
skilled in the art from the foregoing description and
accompanying drawings. Such modifications are intended to
fall within the scope of the appended claims.






SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Shyjan, Andrew W.

(ii) TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE

DIAGNOSIS, PREVENTION AND TREATMENT OF TUMOR 
PROGRESSION 

(iii) NUMBER OF SEQUENCES: 9

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Fish & Richardson P.C.

(B) STREET: 225 Franklin Street 

(C) CITY: Boston 

(D) STATE: MA 
(E) COUNTRY: USA
(F) ZIP: 02110-2804 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 29-MAR-1996 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/412,431 

(B) FILING DATE: 29-MAR-1995 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Fasse, J. Peter 

(B) REGISTRATION NUMBER: 32,983 

(C) REFERENCE/DOCKET NUMBER: 07334/004001

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 617/542-5070 

(B) TELEFAX: 617/542-8906 

(C) TELEX: 200154 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 186 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGTGCTGGAG TACCTCATGG GCGGTGCCTA CCGCTGCAAC TACACTCGGA AAAGCTTCCG 60
GACTCTCTAC AACAACTTGT TTGGCCCTAA GAGGGTAGAG CTCAGCAGAC ACACAGTGTC 120
CTGTGCCTCC CAGAGTAACA TGTGGTTCCT TGATGTGCTT CCCCAAAAGC CCACCTGTGC 180
AGAATG 186
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

AAGGAGGCTA GGCTGCACCC TTCCCGCTTG CTCAGCAGCT GAGGCAGGGT CAGAAAGCAT 60 

GGATAGAGAA GACATTTTGC AAAAGGGAAT GCATCTTTGT AATTCCCAGT ACAAAAGACC 120 

CTAACAGATG TTGCTGTGGT CAGCTCACTA ACCAGCACAT CCCCCCTTTG CCGAGTGGGG 180 

CTCCCAGCAC AACAGGAGAG GACACCAAGC AGGCAGACAC GCAGTCCGGG AAATGGTCTG 240 

TCAGCAAACA CACCCAGAGC TACCCAACAG ACTCCTATGG GATTCTTGAA TTCCAGGGTG 300 

GGGGTTACTC CAATAAAGCC ATG TAC ATC CGA GTC TCC TAC GAC ACC AAG 350
Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys
1 5 10

CCA GAT TCC CTG CTC CAC CTC ATG GTG AAG GAC TGG CAG CTG GAG CTC 398
Pro Asp Ser Leu Leu His Leu Met Val Lys Asp Trp Gln Leu Glu Leu
15 20 25

CCG AAG CTC TTG ATA TCT GTG CAC GGA GGC CTC CAA AGC TTC GAG ATG 446
Pro Lys Leu Leu Ile Ser Val His Gly Gly Leu Gln Ser Phe Glu Met
30 35 40

CAG TCC AAA CTG AAG CAG GTG TTT GGG AAA GGT CTG ATC AAG GCT GCC 494
Gln Ser Lys Leu Lys Gln Val Phe Gly Lys Gly Leu Ile Lys Ala Ala
45 50 55

ATG ACC ACG GGG GCG TGG ATC TTC ACC GGG GGT GTG AGC ACT GGT GTC 542
Met Thr Thr Gly Ala Trp Ile Phe Thr Gly Gly Val Ser Thr Gly Val
60 65 70

GTC AGC CAT GTG GGG GAT GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA 590
Val Ser His Val Gly Asp Ala Leu Lys Asp His Ser Ser Lys Ser Arg
75 80 85 90

GGC CGG CTC TGT GCT ATA GGA ATT GCT CCC TGG GGC ATG GTG GAG AAC 638
Gly Arg Leu Cys Ala Ile Gly Ile Ala Pro Trp Gly Met Val Glu Asn
95 100 105

AAG GAA GAC CTG ATT GGA AAA GAT GTA ACA AGA GTC TAT CAG ACC ATG 686
Lys Glu Asp Leu Ile Gly Lys Asp Val Thr Arg Val Tyr Gln Thr Met
110 115 120

TCC AAC CCT CTG AGC AAG CTC TCT GTG CTC AAC AAT TCC CAC ACT CAC 734
Ser Asn Pro Leu Ser Lys Leu Ser Val Leu Asn Asn Ser His Thr His
125 130 135

TTC ATC TTG GCT GAC AAC GGC ACC CTG GGC AAG TAT GGT GCT GAG GTG 782
Phe Ile Leu Ala Asp Asn Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val
140 145 150

AAG CTT CGA AGA CAG CTG GAA AAA CAC ATC TCC CTG CAG AAG ATC AAC 830
Lys Leu Arg Arg Gln Leu Glu Lys His Ile Ser Leu Gln Lys Ile Asn
155 160 165 170

ACA AGG CTG GGC CAG GGT GTA CCT GTC GTG GGC CTA GTG GTA GAA GGT 878
Thr Arg Leu Gly Gln Gly Val Pro Val Val Gly Leu Val Val Glu Gly
175 180 185

GGT CCT AAC GTG GTT TCT ATC GTC CTG GAG TAT CTC AAA GAA GAC CCT 926
Gly Pro Asn Val Val Ser Ile Val Leu Glu Tyr Leu Lys Glu Asp Pro
190 195 200

CCT GTC CCT GTG GTG GTT TGC GAT GGC AGT GGA CGT GCC TCT GAC ATT 974
Pro Val Pro Val Val Val Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile
205 210 215

TTG TCC TTC GCA CAC AAA TAC TGC GAC GAA GGA GGA GTC ATA AAC GAG 1022
Leu Ser Phe Ala His Lys Tyr Cys Asp Glu Gly Gly Val Ile Asn Glu
220 225 230

TCC CTG CGG GAC CAG CTT CTA GTT ACC ATT CAG AAA ACA TTT AAT TAC 1070
Ser Leu Arg Asp Gln Leu Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr
235 240 245 250

AGC AAG TCC CAG TCG TAT CAG CTG TTT GCA ATT ATC ATG GAG TGC ATG 1118
Ser Lys Ser Gln Ser Tyr Gln Leu Phe Ala Ile Ile Met Glu Cys Met
255 260 265

AAG AAG AAA GAA CTC GTC ACT GTG TTT CGG ATG GGT TCC GAG GGT CAG 1166
Lys Lys Lys Glu Leu Val Thr Val Phe Arg Met Gly Ser Glu Gly Gln
270 275 280

CAA GAT GTC GAG ATG GCA ATT TTA ACT GCC TTG CTC AAA GGA ACC AAC 1214
Gln Asp Val Glu Met Ala Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn
285 290 295

GCA TCA GCT CCA GAT CAG CTG AGC TTG GCC CTG GCT TGG AAC CGG GTC 1262
Ala Ser Ala Pro Asp Gln Leu Ser Leu Ala Leu Ala Trp Asn Arg Val
300 305 310

GAC ATA GCG CGA AGC CAG ATC TTC GTC TTT GGC CCA CAC TGG CCG CCA 1310
Asp Ile Ala Arg Ser Gln Ile Phe Val Phe Gly Pro His Trp Pro Pro
315 320 325 330

CTG GGA AGC CTG GCC CCT CCT GTG GAC ACC AAA GCC GCA GAG AAG GAA 1358
Leu Gly Ser Leu Ala Pro Pro Val Asp Thr Lys Ala Ala Glu Lys Glu
335 340 345

AAG AAG CCA CCC ACA GCC ACC ACC AAG GGG AGA GGA AAA GGA AAA GGC 1406
Lys Lys Pro Pro Thr Ala Thr Thr Lys Gly Arg Gly Lys Gly Lys Gly
350 355 360

AAG AAG AAA GGC AAA GTG AAA GAG GAA GTG GAG GAA GAG ACG GAC CCC 1454
Lys Lys Lys Gly Lys Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro
365 370 375

CGG AAG CTT GAG CTG CTC AAC TGG GTG AAT GCC CTG GAG CAA GCC ATG 1502
Arg Lys Leu Glu Leu Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met
380 385 390

CTG GAT GCT CTT GTC CTA GAT CGG GTG GAC TTT GTA AAG CTC CTG ATT 1550
Leu Asp Ala Leu Val Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile
395 400 405 410

GAA AAC GGA GTG AAC ATG CAG CAT TTC CTC ACC ATC CCG AGG CTG GAG 1598
Glu Asn Gly Val Asn Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu
415 420 425

GAG CTA TAC AAC ACC AGA CTG GGC CCA CCA AAC ACC CTT CAT CTG CTG 1646
Glu Leu Tyr Asn Thr Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu
430 435 440

GTG CGG GAT GTA AAG AAG AGC AAC CTT CCA CCT GAT TAC CAC ATC AGC 1694
Val Arg Asp Val Lys Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser
445 450 455

CTC ATT GAT ATA GGA CTG GTG CTG GAG TAC CTC ATG GGC GGT GCC TAC 1742
Leu Ile Asp Ile Gly Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr
460 465 470

CGC TGC AAC TAC ACT CGG AAA AGC TTC CGG ACT CTC TAC AAC AAC TTG 1790
Arg Cys Asn Tyr Thr Arg Lys Ser Phe Arg Thr Leu Tyr Asn Asn Leu
475 480 485 490

TTT GGC CCT AAG AGG GTA GAG CTC AGC AGA CAC ACA GTG TCC TGT GCC 1838
Phe Gly Pro Lys Arg Val Glu Leu Ser Arg His Thr Val Ser Cys Ala
495 500 505

TCC CAG AGT AAC ATG TGG TTC CTT GAT GTG CTT CCC CAA AAG CCC ACC 1886
Ser Gln Ser Asn Met Trp Phe Leu Asp Val Leu Pro Gln Lys Pro Thr
510 515 520

TGT GCA GAA TGC AAC TCT TCA CCT CAC CTG TCC CAA ACT GAC ATC ACC 1934
Cys Ala Glu Cys Asn Ser Ser Pro His Leu Ser Gln Thr Asp Ile Thr
525 530 535

CCA CCT CTG CCC T GACACCCAGT GCAGGGCCTC CTAGCTTTCA CATGCAGCCA 1987
Pro Pro Leu Pro
540

TTCACATCGC CTCTCAAGAC TGGGCCAGGC AGTGCAACCT GTCAAGCATG TCTGTCCTCC 2047 

CCTCCTTCCT ACAATAGCCC CCCCTCTGGG CCCCATGCCT CTGCTCTCTC AGCCCGTTCT 2107 

CCTCCCCACT GATCACTGGC GCTCCTGTTG TCTTCCAAGG CAAGGAACAA GGAAAAGCAT 2167 

CTTTTTGCCC ACAAAAGTTT AGGGCTCCCC GCTGTTCAAC CATAGCCAAC CTCACTGTAC 2227 

ATCGGAGTCA TCCAGGCCAG CTGCCACACA CAAGCCTTCC CCACCCTATC CCAATAGACC 2287 






CTATTCCTCC ATCAAAATCA AAGCTAACTC CTGGCCTGCC ACATTGCTTC TTCTTGCTCC 2347 

AGCCTGTTAA ACCTCCAATA AATGTCAGAT CTGTGGGAAG CCTTCCTCAC TCTCACTCCA 2407 

CAGTTTGTAC AGAGAGCGAG AGCCTCGTTT GGTTCTACTT ACAAGGAAGG CTTTGTGTCT 2467 

GTCTGTCCTT CCCAACTGAC TTCTGTTGAC AGAAGCAGTT TCCACATGAA AGCGTTGACT 2527 

CACCTGGATG TTGTCATTAA TTAATAGTGA TACAAAATAT TGACACTTCT TTTCCTGCTT 2587 

CTTTGTTATG CAGCCGAAAG CACTTAAGCT TCTGGGAATG GAAGTAAGTA GGACATGTTT 2647 

GTGGCAGTTT ATTTACTATA TATACCTTTG TCATTCTGTG GAAGCAAAAA TTGCAATGTT 2707 

TTCCATGAAT AAAGCTCGTG CC 2729 
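
The pairing of codons and residues in the listing above can be verified mechanically. The following is a minimal sketch that translates the first coding row of SEQ ID NO:2 using the standard genetic code; the table construction and the function names (`translate`, `three_letter`) are illustrative assumptions and not part of the listing.

```python
# Sketch: translate codons from SEQ ID NO:2 with the standard genetic code.
# Function names are illustrative assumptions.
from itertools import product

_BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA coding string (length divisible by 3) to one-letter codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def three_letter(protein):
    """Convert one-letter residue codes to the three-letter codes used in the listing."""
    return [THREE_LETTER[residue] for residue in protein]


if __name__ == "__main__":
    # First coding row of SEQ ID NO:2 (nucleotides 321-350).
    row = "ATG TAC ATC CGA GTC TCC TAC GAC ACC AAG"
    print(" ".join(three_letter(translate(row))))
```

A check of this kind is also a convenient way to confirm residues in a scanned listing, where, for example, `Ile` and `Gln` are easily misread.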

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Ser Phe Glu Met Gln Ser Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Val Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Leu Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Met Val Glu Asn Lys Glu Asp Leu Ile Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Gln Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Val Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Lys Glu Asp Pro Pro Val Pro Val Val Val
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Asp Glu Gly Gly Val Ile Asn Glu Ser Leu Arg Asp Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Ser Lys Ser Gln Ser Tyr
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Val Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Ala Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Pro Pro Leu Gly Ser Leu Ala Pro
325 330 335

Pro Val Asp Thr Lys Ala Ala Glu Lys Glu Lys Lys Pro Pro Thr Ala
340 345 350

Thr Thr Lys Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val
355 360 365

Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Leu Glu Leu Leu
370 375 380

Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu
385 390 395 400

Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met
405 410 415

Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg
420 425 430

Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys
435 440 445

Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu
450 455 460

Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg
465 470 475 480

Lys Ser Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Val
485 490 495

Glu Leu Ser Arg His Thr Val Ser Cys Ala Ser Gln Ser Asn Met Trp
500 505 510

Phe Leu Asp Val Leu Pro Gln Lys Pro Thr Cys Ala Glu Cys Asn Ser
515 520 525

Ser Pro His Leu Ser Gln Thr Asp Ile Thr Pro Pro Leu Pro
530 535 540



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GGGGAAGCAC ATCAAGGAAC 20 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GCAACTACTA CACTCGGAAA AGC 23 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4944 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 346..4837



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
ACTCATTATA GGGNTCGAGC GGCCGCCCGG GCAGGTTTGA GCTGTGCCCT CTCCATTCCA 60
CTGCTGTGGC AGGGTCAGAA ATCTTGGATA GAGAAAACCT TTTGCAAACG GGAATGTATC 120
TTTGTAATTC CTAGCACGAA AGACTCTAAC AGGTGTTGCT GTGGCCAGTT CACCAACCAG 180
CATATCCCCC CTCTGCCAAG TGCAACACCC AGCAAAAATG AAGAGGAAAG CAAACAGGTG 240
GAGACTCAGC CTGAGAAATG GTCTGTTGCC AAGCACACCC AGAGCTACCC AACAGATTCC 300
TATGGAGTTC TTGAATTCCA GGGTGGCGGA TATTCCAATA AAGCC ATG TAT ATC 354
Met Tyr Ile
1

CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG 402
Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His Leu Met Val
5 10 15

AAA GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA 450
Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser Val His Gly
20 25 30 35

GGC CTC CAG AAC TTT GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG 498
Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln Val Phe Gly
40 45 50

AAA GGC CTG ATC AAG GCT GCT ATG ACC ACC GGG GCC TGG ATC TTC ACC 546
Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp Ile Phe Thr
55 60 65

GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT GCC TTG AAA 594
Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp Ala Leu Lys
70 75 80

GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT 642
Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile Gly Ile Ala
85 90 95

CCA TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA 690
Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly Lys Asp Val
100 105 110 115

ACA AGA GTG TAC CAG ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG 738
Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys Leu Ser Val
120 125 130

CTC AAC AAC TCC CAC ACC CAC TTC ATC CTG GCT GAC AAT GGC ACC CTG 786
Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn Gly Thr Leu
135 140 145

GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG GAA AAG CAC 834
Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu Glu Lys His
150 155 160

ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC 882
Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly Val Pro Leu
165 170 175
GTG GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG 930
Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser Ile Val Leu
180 185 190 195

GAA TAC CTG CAA GAA GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC 978
Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile Cys Asp Gly
200 205 210

AGC GGA CGT GCC TCG GAC ATC CTG TCC TTT GCG CAC AAG TAC TGT GAA 1026
Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys Tyr Cys Glu
215 220 225

GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT CTA GTT ACC 1074
Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu Leu Val Thr
230 235 240

ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT 1122
Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His Gln Leu Phe
245 250 255

GCA ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC 1170
Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val Thr Val Phe
260 265 270 275

AGA ATG GGT TCT GAG GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT 1218
Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala Ile Leu Thr
280 285 290

GCC CTG CTG AAA GGA ACA AAC GTA TCT GCT CCA GAT CAG CTG AGC TTG 1266
Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln Leu Ser Leu
295 300 305

GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG ATC TTT GTC 1314
Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Val
310 315 320

TTT GCG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC 1362
Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro Pro Thr Asp
325 330 335

AGC AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG 1410
Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala Thr Thr Lys
340 345 350 355

GGA GGA AGA GGA AAA GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG 1458
Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val Lys Glu
360 365 370

GAA GTG GAG GAA GAA ACT GAC CCC CGG AAG ATA GAG CTG CTG AAC TGG 1506
Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu Leu Asn Trp
375 380 385

GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC TTA GAT CGT 1554
Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu Asp Arg
390 395 400

GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC 1602
Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met Gln His
405 410 415

TTT CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT 1650
Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg Leu Gly
420 425 430 435



CCA CCA AAC ACA CTT CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC 1698
Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys Ser Asn
440 445 450

CTT CCG CCT GAT TAC CAC ATC AGC CTC ATA GAC ATC GGG CTC GTG CTG 1746
Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu Val Leu
455 460 465

GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT CGG AAA AAC 1794
Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg Lys Asn
470 475 480

TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT 1842
Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Pro Lys Ala
485 490 495

CTT AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG 1890
Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala Lys Gly Lys
500 505 510 515

AAA AAA AAA AAA AAG AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC 1938
Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile Asp Val Asp
520 525 530

GAC CCT GCC GTG AGT CGG TTC CAG TAT CCC TTC CAC GAG CTG ATG GTG 1986
Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu Leu Met Val
535 540 545

TGG GCA GTG CTG ATG AAA CGC CAG AAA ATG GCA GTG TTC CTC TGG CAG 2034
Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe Leu Trp Gln
550 555 560

CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC 2082
Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys Lys Leu Tyr
565 570 575

AAG GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC 2130
Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val Asp Asp Ile
580 585 590 595

TCC CAG GAC TTG GAT AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG 2178
Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln Leu Ala Leu
600 605 610

GAG TTA TTA GAC CAG TCC TAT AAG CAT GAC GAG CAG ATC GCT ATG AAA 2226
Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile Ala Met Lys
615 620 625

CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC TGC CTC AAA 2274
Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr Cys Leu Lys
630 635 640

CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC 2322
Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His Thr Cys Ser
645 650 655

CAG ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG 2370
Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg Met Arg Lys
660 665 670 675

AAC CCC GGC CTG AAG GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC 2418
Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro Pro Thr Ile
680 685 690

TTG TTT TTG GAA TTT CGC ACA TAT GAT GAT TTC TCG TAT CAA ACA TCC 2466
Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr Gln Thr Ser
695 700 705

AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT ACG GAT GCA 2514
Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn Thr Asp Ala
710 715 720

AAT GCA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA 2562
Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn Glu His Lys
725 730 735

AAA CAG AGA ATT ATC CCC ATC GGA ACA AAA ATC TGT AAA TTC TAT AAC 2610
Lys Gln Arg Ile Ile Pro Ile Gly Thr Lys Ile Cys Lys Phe Tyr Asn
740 745 750 755

GCG CCC ATT GTC AAG TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC 2658
Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr Leu Gly Tyr
760 765 770

CTG CTG CTG TTT AAC TAC GTC ATC CTG GTG CGG ATG GAT GGC TGG CCG 2706
Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp Gly Trp Pro
775 780 785

TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC CTG GCG TTA 2754
Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser Leu Ala Leu
790 795 800

GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG 2802
Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Ser Gln
805 810 815

AAA ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG 2850
Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr Asp Leu Val
820 825 830 835

GCC ATT TCC ACA TTC ATG ATT GGA GCA ATG GCC ACG AGA TCT GTG ATG 2898
Ala Ile Ser Thr Phe Met Ile Gly Ala Met Ala Thr Arg Ser Val Met
840 845 850

ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT GTG GTC ATC ATG 2946
Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe Val Val Ile Met
855 860 865

CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA GCC ATT CTG CAT 2994
Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln Ala Ile Leu His
870 875 880

CCA GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC ATC TTC TAC ATG 3042
Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn Ile Phe Tyr Met
885 890 895



CCC TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC CAG ATA GAC CTC 3090
Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp Gln Ile Asp Leu
900 905 910 915

TAC GCC ATG GAA ATT AAT CCT CCT TGT GGT GAG AAC CTA TAT GAT GAG 3138
Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn Leu Tyr Asp Glu
920 925 930

GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC TGG CTC ACT CCA 3186
Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala Trp Leu Thr Pro
935 940 945

GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG CTG GTG AAC 3234
Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile Leu Leu Val Asn
950 955 960

CTG CTG ATT GCT GTG TTC AAC AAT ACC TTC TTT GAA GTA AAA TCA ATA 3282
Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu Val Lys Ser Ile
965 970 975

TCC AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG ATT ATG ACA TTT 3330
Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile Met Thr Phe
980 985 990 995

CAT GAC AGG CCA GTC CTG CCC CCA CCG ATG ATC ATT TTA AGC CAC ATC 3378
His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile Leu Ser His Ile
1000 1005 1010

TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG AAA AAG AGA GAA 3426
Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg Lys Lys Arg Glu
1015 1020 1025

GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC TTC CTT AGC GAC 3474
Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu Phe Leu Ser Asp
1030 1035 1040

GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG TGC GTG CAG GAG 3522
Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln Cys Val Gln Glu
1045 1050 1055

CAC TTC CGG GAG AAG GAG GAT GAG CAG CAG TCG TCC AGC GAC GAG CGC 3570
His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser Ser Asp Glu Arg
1060 1065 1070 1075

ATC CGG GTC ACT TCT GAA AGA GTT GAA AAT ATG TCA ATG AGG TTG GAA 3618
Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met Arg Leu Glu
1080 1085 1090

GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC CTG CAG ACT GTT 3666
Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser Leu Gln Thr Val
1095 1100 1105

GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC AGA ATG GTG AAT 3714
Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn Arg Met Val Asn
1110 1115 1120

GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA 3762
Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp Leu Ile Gln Ala
1125 1130 1135

CGG TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT CTT CTC CGG CAA 3810
Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr Leu Leu Arg Gln
1140 1145 1150 1155

AGC AGC ATC AAT AGC GCT GAT GGC TAC AGC TTG TAT CCA TAT CAT TTT 3858
Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr Arg Tyr His Phe
1160 1165 1170

AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC TCC ACG TCA CCA 3906
Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu Ser Thr Ser Pro
1175 1180 1185

GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA AAG GAA GAG 3954
Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg Ile Lys Glu Glu
1190 1195 1200

AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC ACT CTT CAC 4002
Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln Asn Ser Leu His
1205 1210 1215

CTT TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT 4050
Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp Gly Ser His Leu
1220 1225 1230 1235

GCA GTA GAT GAC TTA AAG AAC GCT GAA GAG TCA AAA TTA GGT CCA GAT 4098
Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys Leu Gly Pro Asp
1240 1245 1250

ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA GAC TCT AAA AAA 4146
Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr Asp Ser Lys Lys
1255 1260 1265

GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT GTG ATA CAT GGA 4194
Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp Val Ile His Gly
1270 1275 1280

CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA 4242
Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr Val Glu Thr Thr
1285 1290 1295

AAT ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA 4290
Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu Thr Lys Ile Thr
1300 1305 1310 1315

CGC TAT TTC CCC GAT GAA ACG ATC AAT GCT TGT AAA ACA ATG AAG TCC 4338
Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys Thr Met Lys Ser
1320 1325 1330

AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC GGT GGG GTT AAC 4386
Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val Gly Gly Val Asn
1335 1340 1345

CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG ACG ACG GAA 4434
Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln Leu Thr Thr Glu
1350 1355 1360



TGG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT 4482
Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His Ser Thr Asp Ile
1365 1370 1375

CCT TAC ATT GTG TCG GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG CAG 4530
Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu Gln Lys Glu Gln
1380 1385 1390 1395

TTT GCA GAT ATG CAA GAT GAA CAC CAT GTC GCT GAA GCA ATT CCT CGA 4578
Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu Ala Ile Pro Arg
1400 1405 1410

ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT GGG ATG GAA AAC 4626
Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn Gly Met Glu Asn
1415 1420 1425

TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA TCT CTC AGG 4674
Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe Pro Ser Leu Arg
1430 1435 1440

TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA TCC ATT CAG GGA 4722
Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys Ser Ile Gln Gly
1445 1450 1455

AAG TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC AGC TTA GTA ATT 4770
Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser Ser Leu Val Ile
1460 1465 1470 1475

GTG TCT GGA ATG ACA GCA GAA GAA AAA AAG GTT AAG AAA GAG AAA GCT 4818
Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys Lys Glu Lys Ala
1480 1485 1490

TCC ACA GAA ACT GAA TGC T AGTCTGTTTT GTTTCTTTAA TTTTTTTTTT 4867
Ser Thr Glu Thr Glu Cys
1495

TAACAGTCAG AACCACTAAT GGGTGTCATC TTGGCCATCC TAAACATCCA TCCAATTTCC 4927
TAAAAACATT TTCCCTT 4944

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1497 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Leu Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Glu Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro
325 330 335 

Pro Thr Asp Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala 
340 345 350 

Thr Thr Lys Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys
355 360 365

Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu
370 375 380

Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val
385 390 395 400

Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn
405 410 415

Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr
420 425 430

Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys
435 440 445

Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly
450 455 460

Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr
465 470 475 480

Arg Lys Asn Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg
485 490 495

Pro Lys Ala Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala
500 505 510

Lys Gly Lys Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile
515 520 525

Asp Val Asp Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu
530 535 540

Leu Met Val Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe
545 550 555 560

Leu Trp Gln Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys
565 570 575

Lys Leu Tyr Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val
580 585 590

Asp Asp Ile Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln
595 600 605

Leu Ala Leu Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile
610 615 620

Ala Met Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr
625 630 635 640

Cys Leu Lys Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His
645 650 655 

Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg
660 665 670

Met Arg Lys Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro
675 680 685

Pro Thr Ile Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr
690 695 700

Gln Thr Ser Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn
705 710 715 720

Thr Asp Ala Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn
725 730 735

Glu His Lys Lys Gln Arg Ile Ile Pro Ile Gly Thr Lys Ile Cys Lys
740 745 750

Phe Tyr Asn Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr
755 760 765

Leu Gly Tyr Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp
770 775 780

Gly Trp Pro Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser
785 790 795 800

Leu Ala Leu Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys
805 810 815

Leu Ser Gln Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr
820 825 830

Asp Leu Val Ala Ile Ser Thr Phe Met Ile Gly Ala Met Ala Thr Arg
835 840 845

Ser Val Met Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe Val
850 855 860

Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln Ala
865 870 875 880

Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn Ile
885 890 895

Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp Gln
900 905 910

Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn Leu
915 920 925

Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala Trp
930 935 940

Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile Leu
945 950 955 960 

Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu Val
965 970 975

Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile
980 985 990

Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile Leu
995 1000 1005

Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg Lys
1010 1015 1020

Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu Phe
1025 1030 1035 1040

Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln Cys
1045 1050 1055

Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser Ser
1060 1065 1070

Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met
1075 1080 1085

Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser Leu
1090 1095 1100

Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn Arg
1105 1110 1115 1120

Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp Leu
1125 1130 1135

Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr Leu
1140 1145 1150

Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr Arg
1155 1160 1165

Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu Ser
1170 1175 1180

Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg Ile
1185 1190 1195 1200

Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln Asn
1205 1210 1215

Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp Gly
1220 1225 1230

Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys Leu
1235 1240 1245

Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr Asp
1250 1255 1260 

Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp Val
1265 1270 1275 1280

Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr Val
1285 1290 1295

Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu Thr
1300 1305 1310

Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys Thr
1315 1320 1325

Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val Gly
1330 1335 1340

Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln Leu
1345 1350 1355 1360

Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His Ser
1365 1370 1375

Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu Gln
1380 1385 1390

Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu Ala
1395 1400 1405

Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn Gly
1410 1415 1420

Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe Pro
1425 1430 1435 1440

Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys Ser
1445 1450 1455

Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser Ser
1460 1465 1470

Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys Lys
1475 1480 1485 

Glu Lys Ala Ser Thr Glu Thr Glu Cys 
1490 1495 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5055 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 346..4945



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ACTCATTATA GGGNTCGAGC GGCCGCCCGG GCAGGTTTGA GCTGTGCCCT CTCCATTCCA 60 

CTGCTGTGGC AGGGTCAGAA ATCTTGGATA GAGAAAACCT TTTGCAAACG GGAATGTATC 120 

TTTGTAATTC CTAGCACGAA AGACTCTAAC AGGTGTTGCT GTGGCCAGTT CACCAACCAG 180 

CATATCCCCC CTCTGCCAAG TGCAACACCC AGCAAAAATG AAGAGGAAAG CAAACAGGTG 240 

GAGACTCAGC CTGAGAAATG GTCTGTTGCC AAGCACACCC AGAGCTACCC AACAGATTCC 300 

TATGGAGTTC TTGAATTCCA GGGTGGCGGA TATTCCAATA AAGCC ATG TAT ATC 354 

Met Tyr Ile
1

CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG 402
Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His Leu Met Val
5 10 15

AAA GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA 450
Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser Val His Gly
20 25 30 35

GGC CTC CAG AAC TTT GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG 498
Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln Val Phe Gly
40 45 50

AAA GGC CTG ATC AAG GCT GCT ATG ACC ACC GGG GCC TGG ATC TTC ACC 546
Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp Ile Phe Thr
55 60 65

GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT GCC TTG AAA 594
Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp Ala Leu Lys
70 75 80

GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT 642
Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile Gly Ile Ala
85 90 95

CCA TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA 690
Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly Lys Asp Val
100 105 110 115

ACA AGA GTG TAC CAG ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG 738
Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys Leu Ser Val
120 125 130

CTC AAC AAC TCC CAC ACC CAC TTC ATC CTG GCT GAC AAT GGC ACC CTG 786
Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn Gly Thr Leu
135 140 145 

GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG GAA AAG CAC 834 
Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu Glu Lys His
150 155 160

ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC 882
Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly Val Pro Leu
165 170 175

GTG GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG 930
Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser Ile Val Leu
180 185 190 195

GAA TAC CTG CAA GAA GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC 978
Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile Cys Asp Gly
200 205 210

AGC GGA CGT GCC TCG GAC ATC CTG TCC TTT GCG CAC AAG TAC TGT GAA 1026
Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys Tyr Cys Glu
215 220 225

GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT CTA GTT ACC 1074
Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu Leu Val Thr
230 235 240

ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT 1122
Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His Gln Leu Phe
245 250 255

GCA ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC 1170
Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val Thr Val Phe
260 265 270 275

AGA ATG GGT TCT GAG GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT 1218
Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala Ile Leu Thr
280 285 290

GCC CTG CTG AAA GGA ACA AAC GTA TCT GCT CCA GAT CAG CTG AGC TTG 1266
Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln Leu Ser Leu
295 300 305

GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG ATC TTT GTC 1314
Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Val
310 315 320 

TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC 1362 
Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro Pro Thr Asp 
325 330 335 

AGC AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG 1410 
Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala Thr Thr Lys 
340 345 350 355 

GGA GGA AGA GGA AAA GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG 1458 
Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val Lys Glu 
360 365 370 

GAA GTG GAG GAA GAA ACT GAC CCC CGG AAG ATA GAG CTG CTG AAC TGG 1506 
Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu Leu Asn Trp
375 380 385 

GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC TTA GAT CGT 1554
Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu Asp Arg
390 395 400

GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC 1602
Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met Gln His
405 410 415

TTT CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT 1650
Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg Leu Gly
420 425 430 435 

CCA CCA AAC ACA CTT CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC 1698 
Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys Ser Asn 
440 445 450 

CTT CCG CCT GAT TAC CAC ATC AGC CTC ATA GAC ATC GGG CTC GTG CTG 1746 
Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu Val Leu
455 460 465 

GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT CGG AAA AAC 1794 
Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg Lys Asn 
470 475 480 

TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT 1842 
Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Pro Lys Ala 
485 490 495 

CTT AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG 1890 
Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala Lys Gly Lys 
500 505 510 515 

AAA AAA AAA AAA AAG AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC 1938 
Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile Asp Val Asp
520 525 530

GAC CCT GCC GTG AGT CGG TTC CAG TAT CCC TTC CAC GAG CTG ATG GTG 1986
Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu Leu Met Val
535 540 545

TGG GCA GTG CTG ATG AAA CGC CAG AAA ATG GCA GTG TTC CTC TGG CAG 2034
Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe Leu Trp Gln
550 555 560 

CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC 2082 
Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys Lys Leu Tyr 
565 570 575 

AAG GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC 2130 
Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val Asp Asp Ile
580 585 590 595

TCC CAG GAC TTG GAT AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG 2178
Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln Leu Ala Leu
600 605 610

GAG TTA TTA GAC CAG TCC TAT AAG CAT GAC GAG CAG ATC GCT ATG AAA 2226
Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile Ala Met Lys
615 620 625



CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC TGC CTC AAA 2274 
Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr Cys Leu Lys 
630 635 640 

CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC 2322
Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His Thr Cys Ser
645 650 655

CAG ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG 2370
Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg Met Arg Lys
660 665 670 675

AAC CCC GGC CTG AAG GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC 2418
Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro Pro Thr Ile
680 685 690

TTG TTT TTG GAA TTT CGC ACA TAT GAT GAT TTC TCG TAT CAA ACA TCC 2466
Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr Gln Thr Ser
695 700 705 

AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT ACG GAT GCA 2514 
Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn Thr Asp Ala 
710 715 720 

AAT GCA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA 2562 
Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn Glu His Lys 
725 730 735 

AAA CAG AGA AGT ATT CCC ATC GGA ACA AAG ATC TGT GAA TTC TAT AAC 2610 
Lys Gln Arg Ser Ile Pro Ile Gly Thr Lys Ile Cys Glu Phe Tyr Asn
740 745 750 755

GCG CCC ATT GTC AAG TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC 2658
Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr Leu Gly Tyr
760 765 770

CTG CTG CTG TTT AAC TAC GTC ATC CTG GTG CGG ATG GAT GGC TGG CCG 2706
Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp Gly Trp Pro
775 780 785

TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC CTG GCG TTA 2754
Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser Leu Ala Leu
790 795 800

GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG 2802
Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Ser Gln
805 810 815

AAA ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG 2850
Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr Asp Leu Val
820 825 830 835

GCC ATT TCC ACA TTC ATG ATT GGA GCA ATT CTT CGC CTA CAG AAC CAG 2898
Ala Ile Ser Thr Phe Met Ile Gly Ala Ile Leu Arg Leu Gln Asn Gln
840 845 850 

CCC TAC ATG GGC TAT GGC CGG GTG ATC TAC TGT GTG GAT ATC ATC TTC 2946
Pro Tyr Met Gly Tyr Gly Arg Val Ile Tyr Cys Val Asp Ile Ile Phe
855 860 865

TGG TAC ATC CGT GTC CTG GAC ATC TTT GGT GTC AAC AAG TAT CTG GGG 2994
Trp Tyr Ile Arg Val Leu Asp Ile Phe Gly Val Asn Lys Tyr Leu Gly
870 875 880

CCA TAC GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT 3042
Pro Tyr Val Met Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe
885 890 895

GTG GTC ATC ATG CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA 3090
Val Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln
900 905 910 915

GCC ATT CTG CAT CCA GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC 3138
Ala Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn
920 925 930

ATC TTC TAC ATG CCC TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC 3186
Ile Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp
935 940 945

CAG ATA GAC CTC TAC GCC ATG GAA ATT AAT CCT CCT TGT GGT GAG AAC 3234
Gln Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn
950 955 960

CTA TAT GAT GAG GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC 3282
Leu Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala
965 970 975

TGG CTC ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC 3330
Trp Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile
980 985 990 995

CTG CTG GTG AAC CTG CTG ATT GCT GTG TTC AAC AAT ACT TTC TTT GAA 3378
Leu Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu
1000 1005 1010

GTA AAA TCA ATA TCC AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG 3426
Val Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu
1015 1020 1025

ATT ATG ACA TTT CAT GAC AGG CCA GTC CTG CCC CCA CCG ATG ATC ATT 3474
Ile Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile
1030 1035 1040

TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG 3522
Leu Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg
1045 1050 1055

AAA AAG AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC 3570
Lys Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu
1060 1065 1070 1075

TTC CTT AGC GAC GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG 3618
Phe Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln
1080 1085 1090



TGC GTG CAG GAG CAC TTC CGG GAG AAG GAG GAT GAG CAG CAG TCG TCC 3666
Cys Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser
1095 1100 1105

AGC GAC GAG CGC ATC CGG GTC ACT TCT GAA AGA GTT GAA AAT ATG TCA 3714
Ser Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser
1110 1115 1120

ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC 3762
Met Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser
1125 1130 1135

CTG CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC 3810
Leu Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn
1140 1145 1150 1155

AGA ATG GTG AAT GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC 3858
Arg Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp
1160 1165 1170

CTG ATC CAG GCA CGG TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT 3906
Leu Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr
1175 1180 1185

CTT CTC CGG CAA AGC AGC ATC AAT AGC GCT GAT GGC TAC AGC TTG TAT 3954
Leu Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr
1190 1195 1200

CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC 4002
Arg Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu
1205 1210 1215

TCC ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT 4050
Ser Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg
1220 1225 1230 1235

ATA AAG GAA GAG AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG 4098
Ile Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln
1240 1245 1250

AAC AGT CTT CAC CTT TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT 4146
Asn Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp
1255 1260 1265

GGC AGT CAC CTT GCA GTA GAT GAC TTA AAG AAC GCT GAA GAG TCA AAA 4194
Gly Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys
1270 1275 1280

TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA 4242
Leu Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr
1285 1290 1295

GAC TCT AAA AAA GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT 4290
Asp Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp
1300 1305 1310 1315 

GTG ATA CAT GGA CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA 4338
Val Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr
1320 1325 1330

GTG GAA ACG ACA AAT ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA 4386
Val Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu
1335 1340 1345

ACC AAA ATT ACA CGC TAT TTC CCC GAT GAA ACG ATC AAT GCT TGT AAA 4434
Thr Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys
1350 1355 1360

ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC 4482
Thr Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val
1365 1370 1375

GGT GGG GTT AAC CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA 4530
Gly Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln
1380 1385 1390 1395

TTG ACG ACG GAA TGG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT 4578
Leu Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His
1400 1405 1410

AGC ACA GAT ATT CCT TAC ATT GTG TCG GAA GCT GCA GTG CAA GCT GAG 4626
Ser Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu
1415 1420 1425

CAA AAA GAG CAG TTT GCA GAT ATG CAA GAT GAA CAC CAT GTC GCT GAA 4674
Gln Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu
1430 1435 1440

GCA ATT CCT CGA ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT 4722
Ala Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn
1445 1450 1455

GGG ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC 4770
Gly Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe
1460 1465 1470 1475 

CCA TCT CTC AGG TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA 4818 
Pro Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys 
1480 1485 1490 

TCC ATT CAG GGA AAG TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC 4866 
Ser Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser
1495 1500 1505

AGC TTA GTA ATT GTG TCT GGA ATG ACA GCA GAA GAA AAA AAG GTT AAG 4914
Ser Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys
1510 1515 1520 

AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC T AGTCTGTTTT GTTTCTTTAA 4965 
Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys 
1525 1530 

TTTTTTTTTT TAACAGTCAG AAACCCACTA ATGGGTGTCA TCTTGGCCCA TCCTAAACAC 5025 
ATMTCCAATT TCCTAAAAAC ATTTTCCCTT 5055



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1533 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Leu Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Glu Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu
225 230 235 240 

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro
325 330 335 

Pro Thr Asp Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala 
340 345 350 

Thr Thr Lys Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys 
355 360 365 

Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu
370 375 380

Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val
385 390 395 400

Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn
405 410 415

Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr
420 425 430 

Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys 
435 440 445 

Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly
450 455 460 

Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr 
465 470 475 480 

Arg Lys Asn Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg 
485 490 495 

Pro Lys Ala Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala 
500 505 510 

Lys Gly Lys Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile
515 520 525

Asp Val Asp Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu
530 535 540 

Leu Met Val Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe
545 550 555 560

Leu Trp Gln Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys
565 570 575 

Lys Leu Tyr Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val 
580 585 590 

Asp Asp Ile Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln
595 600 605

Leu Ala Leu Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile
610 615 620 

Ala Met Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr 
625 630 635 640 

Cys Leu Lys Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His
645 650 655

Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg
660 665 670

Met Arg Lys Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro
675 680 685

Pro Thr Ile Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr
690 695 700

Gln Thr Ser Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn
705 710 715 720 

Thr Asp Ala Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn 
725 730 735 

Glu His Lys Lys Gln Arg Ser Ile Pro Ile Gly Thr Lys Ile Cys Glu
740 745 750

Phe Tyr Asn Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr
755 760 765

Leu Gly Tyr Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp
770 775 780

Gly Trp Pro Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser
785 790 795 800

Leu Ala Leu Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys
805 810 815

Leu Ser Gln Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr
820 825 830

Asp Leu Val Ala Ile Ser Thr Phe Met Ile Gly Ala Ile Leu Arg Leu
835 840 845 

Gln Asn Gln Pro Tyr Met Gly Tyr Gly Arg Val Ile Tyr Cys Val Asp
850 855 860

Ile Ile Phe Trp Tyr Ile Arg Val Leu Asp Ile Phe Gly Val Asn Lys
865 870 875 880

Tyr Leu Gly Pro Tyr Val Met Met Ile Gly Lys Met Met Ile Asp Met
885 890 895

Leu Tyr Phe Val Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val
900 905 910

Ala Arg Gln Ala Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu
915 920 925

Ala Arg Asn Ile Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val
930 935 940

Phe Ala Asp Gln Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys
945 950 955 960

Gly Glu Asn Leu Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile
965 970 975

Pro Gly Ala Trp Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val
980 985 990

Ala Asn Ile Leu Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr
995 1000 1005

Phe Phe Glu Val Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg
1010 1015 1020

Tyr Gln Leu Ile Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro
1025 1030 1035 1040

Met Ile Ile Leu Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly
1045 1050 1055

Arg Cys Arg Lys Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly
1060 1065 1070

Leu Lys Leu Phe Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe
1075 1080 1085

Glu Glu Gln Cys Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln
1090 1095 1100

Gln Ser Ser Ser Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu
1105 1110 1115 1120

Asn Met Ser Met Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met
1125 1130 1135

Lys Thr Ser Leu Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu
1140 1145 1150 

Leu Ser Asn Arg Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp
1155 1160 1165

Arg Ser Asp Leu Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu
1170 1175 1180

Ala Thr Tyr Leu Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr
1185 1190 1195 1200

Ser Leu Tyr Arg Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp
1205 1210 1215

Thr Ser Leu Ser Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys
1220 1225 1230

Ser Phe Arg Ile Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro
1235 1240 1245

Glu Cys Gln Asn Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala
1250 1255 1260

Thr Pro Asp Gly Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu
1265 1270 1275 1280

Glu Ser Lys Leu Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu
1285 1290 1295

Arg Gln Thr Asp Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn
1300 1305 1310

Lys Thr Asp Val Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr
1315 1320 1325

Gln Leu Thr Val Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro
1330 1335 1340

Leu Glu Glu Thr Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn
1345 1350 1355 1360

Ala Cys Lys Thr Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg
1365 1370 1375

Lys Leu Val Gly Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr
1380 1385 1390

Asp Gln Gln Leu Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr
1395 1400 1405

Arg Ser His Ser Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val
1410 1415 1420

Gln Ala Glu Gln Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His
1425 1430 1435 1440

Val Ala Glu Ala Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr
1445 1450 1455

Asp Arg Asn Gly Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr
1460 1465 1470

Leu Gly Phe Pro Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg
1475 1480 1485

Asn Val Lys Ser Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser
1490 1495 1500

Ser Val Ser Ser Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys
1505 1510 1515 1520 

Lys Val Lys Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys 
1525 1530 
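
The three-letter rows of the listing above can be cross-checked against the one-letter rows printed in the figures. The following is a minimal sketch, not part of the application as filed; the function name `to_one_letter` is a hypothetical helper, and the mapping is the standard three-letter to one-letter amino acid code.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row: str) -> str:
    """Convert a whitespace-separated row of three-letter residues.

    Raises KeyError on any token that is not a standard residue code,
    which makes transcription damage in a row easy to spot.
    """
    return "".join(THREE_TO_ONE[res] for res in row.split())


# Final row of the listing (residues 1521 to 1533).
last_row = "Lys Val Lys Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys"
print(to_one_letter(last_row))
```

The converted row can then be compared with the corresponding one-letter row of the figure that depicts the same sequence.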
WHAT IS CLAIMED IS:



1 1. An isolated nucleic acid comprising the 

2 nucleotide sequence SEQ ID NO: 2 as depicted in FIG. 3, or

3 the nucleotide sequence of a gene or gene fragment contained 

4 in the following clone as deposited with the NRRL: pFOMY030 

5 (NRRL accession No. B-21416).

1 2. An isolated nucleic acid which encodes the 

2 amino acid sequence depicted in Fig. 3 from amino acid 

3 residue number 1 to 542, or its complement. 

1 3. An isolated nucleic acid which hybridizes to an 

2 isolated nucleic acid which encodes the amino acid sequence 

3 depicted in Fig. 3 from amino acid residue number 1 to 542,

4 or its complement. 

1 4. A genetically engineered host cell comprising 

2 the nucleotide sequence of Claim 2. 

1 5. An expression vector comprising the nucleotide 

2 sequence of Claim 2 in operative association with a 

3 nucleotide sequence regulatory element that controls 

4 expression of the nucleotide sequence in a host cell. 

1 6. A substantially pure gene product encoded by the 

2 nucleic acid of Claim 3. 

1 7. An isolated nucleic acid comprising the 

2 nucleotide sequence SEQ ID NO: 6 as depicted in FIG. 5, or 

3 SEQ ID NO: 8 as depicted in FIG. 6. 
1 8. An isolated nucleic acid which encodes the amino

2 acid sequence depicted in Fig. 5 from amino acid residue

3 number 1 to 1497, or its complement. 

1 9. An isolated nucleic acid which encodes the amino

2 acid sequence depicted in Fig. 6 from amino acid residue 

3 number 1 to 1533, or its complement. 

1 10. An isolated nucleic acid which hybridizes to a

2 nucleic acid that encodes the amino acid sequence depicted 

3 in Fig. 5 from amino acid residue number 1 to 1497 (SEQ ID 

4 NO: 7), or its complement, or the amino acid sequence 

5 depicted in Fig. 6 from amino acid residue number 1 to 1533 

6 (SEQ ID NO:9), or its complement. 

1 11. A nucleotide vector containing the nucleotide 

2 sequence of Claim 10. 

1 12. A genetically engineered host cell containing 

2 the nucleotide sequence of Claim 10. 

1 13. An expression vector containing the nucleotide 

2 sequence of Claim 10 in operative association with a 

3 nucleotide sequence regulatory element that controls 

4 expression of the nucleotide sequence in a host cell. 

1 14. A substantially pure gene product encoded by 

2 the nucleic acid of Claim 10. 

1 15. An antibody that immunospecifically binds the

2 gene product of Claim 14. 
1 16. A method of diagnosing tumor progression in a

2 mammal, said method comprising: 

3 obtaining a test sample of tissue cells from the 

4 mammal; 

5 obtaining a control sample of known normal cells 

6 from the same type of tissue; and 

7 detecting in both the test sample and the control 

8 sample the level of expression of a gene transcript or gene 

9 product of gene 030, wherein a level of expression lower in 

10 the test sample than in the control sample indicates a tumor 

11 progression state in the test sample. 

1 17. A method for treating tumor progression in a 



2 mammal, said method comprising administering to the mammal a 

3 compound in an amount effective to increase the level of 

4 expression or activity of a gene transcript or gene product 

5 of gene 030 in cells exhibiting a tumor progression state, 

6 to a level effective to ameliorate symptoms of the tumor 

7 progression state. 

1 18. A method of Claim 17, wherein the compound 

2 comprises a nucleic acid whose administration results in an 

3 increase in the level of the differentially expressed gene 

4 transcript and gene product in the cells in the tumor 

5 progression state, thereby ameliorating symptoms of the 

6 tumor progression state. 



1 19. A method of Claim 18, wherein the nucleic acid 

2 comprises a nucleic acid of gene 030. 

1 20. A method of Claim 17, wherein the compound is a 

2 030 gene product. 
1 21. A method of Claim 17, wherein the tumor

2 progression state is neoplasia. 

1 22. A method of Claim 17, wherein the tumor 

2 progression state is metastasis. 

1 23. A method for inhibiting tumor progression in a 

2 mammal, said method comprising administering to the mammal a 

3 normal allele of a 030 gene so that the normal gene product 

4 is expressed, thereby inhibiting tumor progression. 

1 24. A method of Claim 23, wherein the tumor 

2 progression is metastasis. 

1 25. A method for treating tumor progression in a 

2 mammal, said method comprising administering to the mammal 

3 an effective amount of a 030 gene product. 

1 26. A method of Claim 25, wherein the tumor 

2 progression is metastasis. 
1 27. A method of monitoring the efficacy of a

2 compound in clinical trials for inhibition of tumor 

3 progression in a patient, said method comprising 

4 obtaining a first sample of tumor tissue cells from 

5 the patient; 

6 administering the compound to the patient; 

7 after a time sufficient for the compound to inhibit 

8 tumor progression, obtaining a second sample of tumor tissue

9 cells from the patient; and 

10 detecting in the first and second samples the level 

11 of expression of a 030 gene transcript or product, wherein a 

12 level of expression higher in the second sample than in the 

13 first sample indicates that the compound is effective to 

14 inhibit tumor progression in the patient.

1 28. A method of Claim 27, wherein the 030 gene

2 transcript or gene product is differentially expressed in 

3 individuals predisposed to a metastatic neoplastic disease. 
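
The comparisons recited in Claims 16 and 27 can be illustrated with a short sketch. It is not part of the claims and assumes only that each sample yields a single numeric expression level for gene 030 measured on a common scale; the function names and the example values are hypothetical, and the claims themselves specify no units, assay, or threshold.

```python
def indicates_tumor_progression(test_level: float, control_level: float) -> bool:
    """Claim 16 comparison: expression lower in the test sample than in the control."""
    return test_level < control_level


def compound_appears_effective(first_level: float, second_level: float) -> bool:
    """Claim 27 comparison: expression higher in the second sample than in the first."""
    return second_level > first_level


# Hypothetical expression levels in arbitrary units (not data from the application).
print(indicates_tumor_progression(test_level=0.4, control_level=1.0))
print(compound_appears_effective(first_level=0.4, second_level=0.9))
```

In practice a measured difference would also have to exceed assay variability before being treated as meaningful; that step is outside what the claims recite and is omitted here.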
COMPOSITIONS AND METHODS FOR THE DIAGNOSIS,
PREVENTION AND TREATMENT OF TUMOR PROGRESSION 

ABSTRACT 

The present invention relates to methods and 
compositions for the diagnosis, prevention, and treatment of 
tumor progression in cells involved in human tumors such as 
melanomas, breast, gastrointestinal, lung, and bone tumors, 
various types of skin cancers, and other neoplastic 
conditions such as leukemias and lymphomas. Genes are 
identified that are differentially expressed in benign 
(e.g., non-malignant) tumor cells relative to malignant
tumor cells exhibiting a high metastatic potential. Genes
are also identified via the ability of their gene products
to interact with gene products involved in the progression
to, and/or aggressiveness of, neoplastic tumor disease
states. The genes and gene products identified can be used
diagnostically or for therapeutic intervention. 
FIGURE 1




FIGURE 2 



romy030 

GGTGCTGGAGTACCTCATGGGCGGTGCCTACCGCTGCAACTACACTCGGAAAAGCTTCCGGACTCT 

CTACAACAACTTGTTTGGCCCTAAGAGGGTAGAGCTCAGCAGACACACAGTGTCCTGTGCCTCCC 

AGAGTAACATGTGGTTCCTTGATGTGCTTCCCCAAAAGCCCACCTGTGCAGAATG 
CTT GAG CTG CTC AAC TOG GTG MT GCC CTG GAG CAA GCC ATG CTG CAT GOT CTT GTC CTA 1520

LEGLNWVNALEQAMLDALVL 100 

GAT CGG GTG GAC TTT GTA AAG CTC CTG ATT GAA AAC GGA CTG AAC ATG CAG CAT TTC CTC 1580 

ORVOFVKULIENGVRMQHFL 420 

ACC ATC CCG AGG CTG GAG GAG CTA TAC AAC ACC AGA CTG GGC CCA CCA AAC ACC CTT CAT 1640 

T I PRLEELYNTR LGPPMTLH -140 

CTG CTG GTG CGG GAT GTA AAG AAG ACC AAC CTT CCA CCT GAT - TAC CAC ATC ACC CTC ATT 1700 

LLVRDVKKSNLPPDVHISL,! -160 

GAT ATA GGA CTG GTG CTG GAG TAC CTC ATG GGC GGT GCC TAC CGC TGC AAC TAC ACT CGG 1760 

OrCLVLSYLMGGAYRCt4YTR 480 

AAA ACC TTC CCG ACT CTC TAC AA.C AAC TTG TTT GGC CCT AAG AGG GTA GAG CTC ACC AGA 1820 

KSFRTLYNNLFGPKRVELSR 500 

CAC ACA GTG TCC TGT GCC TCC CAG ACT AAC ATG TGG TTC CTT GAT GTG CTT CCC CAA AAG 1880 

HTVSCASQSNMW"FLCVLPQK 520 

CCC ACC TGT CCA GAA TGC AAC TCT TCA CCT CAC CTG TCC CAA ACT GAC ATC ACC CCA CCT 1940 

PTCAECNSSPKLSQTOtTPP 540 

CTG CCC TCA CACCCACrTGCAGGGCCTCCTAGCTTTCACATGCAGCCATTCAC 2018 
LP* C*£* 111 A/£>*2) 

AGTGCViCCTCrrCAAGCAT^TCTGTCCTCCCCTC 2098 

AGCCCGT TC TCC TC CCCACTGATCACTGGCt X w lt X "^ 2178 

ACAAAAGTTTAGGGCTCCCCGCTGTTC^ 22 58 

CAAGCCTTCCCCACCCTATCCCAATACACCCTATTCCTCCATCAAAATC^ 2338 

TTCTTGCTCCAGCCTCTT^^ 2418 

ACaCMXGftGftGCe TO 2498 

AGAAGCAGTTTCCACATGAAAOCGTTC3^ 2 57 8 

•nTCCTG C ' l lC rn^n 'ATCXaUSCCGAAAGCACTOU^ 2658 

ATTTACTATATATACCTTTGTCATTCTGTGGAAGCAAAAATTGCAATff 27 30 
FIGURE 5



AATCTTGGATAGAGAAAACCTTTTGC&A^ 
CTGTGGCCAGTTCACCAACCAGCATATCCCCCCT^ 
GTGGAGACTCAGCCTGAGAAATGGTCTGTTGC^ 
CCAGGGTGGCGGATATTCCAATAAAGCC 

MYIRVSYDTKPDSLLHLMVK 20 

ATG TAT ATC CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG AAA 60 

DWQLELPKLLISVHGGLQNF 40 

GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA GGC CTC CAG AAC TTT 120 

EMQPKLKQVFGKGLIKAAMT 60

GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG AAA GGC CTG ATC AAG GCT GCT ATG ACC 180 

TGAWIFTGGVSTGVISHVGD 80 

ACC GGG GCC TGG ATC TTC ACC GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT 240 

ALKDHSSKSRGRVCAIGIAP 100 

GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT CCA 300 

WGIVENKEDLVGKDVTRVYQ 120 

TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA ACA AGA GTG TAC CAG 360 

TMSNPLSKLSVLNNSHTHFI 140 

ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG CTC AAC AAC TCC CAC ACC CAC TTC ATC 420 

LADNGTLGKYGAEVKLRRLL 160 

CTG GCT GAC AAT GGC ACC CTG GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG 480 

EKHISLQKINTRLGQGVPLV 180

GAA AAG CAC ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC GTG 540 

GLVVEGGPNVVSIVLEYLQE 200 

GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG GAA TAC CTG CAA GAA 600 

EPPIPVVICDGSGRASDILS 220 

GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC AGC GGA CGT GCC TCG GAC ATC CTG TCC 660 

FAHKYCEEGGIINESLREQL 240

TTT GCG CAC AAG TAC TGT GAA GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT 720

LVTIQKTFNYNKAQSHQLFA 260

CTA GTT ACC ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT GCA 780

IIMECMKKKELVTVFRMGSE 280 

ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC AGA ATG GGT TCT GAG 840 

GQQDIEMAILTALLKGTNVS 300 

GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT GCC CTG CTG AAA GGA ACA AAC GTA TCT 900 

APDQLSLALAWNRVDIARSQ 320

GCT CCA GAT CAG CTG AGC TTG GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG 960

IFVFGPHWTPLGSLAPPTDS 340

ATC TTT GTC TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC AGC 1020

KATEKEKKPPMATTKGGRGK 360

AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG GGA GGA AGA GGA AAA 1080

GKGKKKGKVKEEVEEETDPR 380

GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG GAA GTG GAG GAA GAA ACT GAC CCC CGG 1140 

KIELLNWVNALEQAMLDALV 400 
AAG ATA GAG CTG CTG AAC TGG GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC 1200

LDRVDFVKLLIENGVNMQHF 420 

TTA GAT CGT GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC TTT 1260 

LTIPRLEELYNTRLGPPNTL 440 

CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT CCA CCA AAC ACA CTT 1320 

HLLVRDVKKSNLPPDYHISL 460 

CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC CTT CCG CCT GAT TAC CAC ATC AGC CTC 1380

IDIGLVLEYLMGGAYRCNYT 480 

ATA GAC ATC GGG CTC GTG CTG GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT 1440 

RKNFRTLYNNLFGPKRPKAL 500 

CGG AAA AAC TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT CTT 1500 

KLLGMEDDEPPAKGKKKKKK 520 

AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG AAA AAA AAA AAA AAG 1560 

KKEEEIDIDVDDPAVSRFQY 540 

AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC GAC CCT GCC GTG AGT CGG TTC CAG TAT 1620 

PFHELMVWAVLMKRQKMAVF 560 

CCC TTC CAC GAG CTG ATG GTG TGG GCA GTG CTG ATG AAA CGC CAG AAA ATG GCA GTG TTC 1680 

LWQRGEESMAKALVACKLYK 580 

CTC TGG CAG CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC AAG 1740 

AMAHESSESDLVDDISQDLD 600

GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC TCC CAG GAC TTG GAT 1800 

NNSKDFGQLALELLDQSYKH 620 

AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG GAG TTA TTA GAC CAG TCC TAT AAG CAT 1860 

DEQIAMKLLTYELKNWSNST 640 

GAC GAG CAG ATC GCT ATG AAA CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC 1920 

CLKLAVAAKHRDFIAHTCSQ 660 

TGC CTC AAA CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC CAG 1980 

MLLTDMWMGRLRMRKNPGLK 680 

ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG AAC CCC GGC CTG AAG 2040 

VIMGILLPPTILFLEFRTYD 700

GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC TTG TTT TTG GAA TTT CGC ACA TAT GAT 2100

DFSYQTSKENEDGKEKEEEN 720

GAT TTC TCG TAT CAA ACA TCC AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT 2160 

TDANADAGSRKGDEENEHKK 740 

ACG GAT GCA AAT GCA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA AAA 2220 

QRIIPIGTKICKFYNAPIVK 760

CAA AGA ATT ATC CCC ATC GGA ACA AAA ATC TGT AAA TTC TAT AAC GCG CCC ATT GTC AAG 2280 

FWFYTISYLGYLLLFNYVIL 780 

TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC CTG CTG CTG TTT AAC TAC GTC ATC CTG 2340 

VRMDGWPSLQEWIVISYIVS 800 

GTG CGG ATG GAT GGC TGG CCG TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC 2400 

LALEKIREILMSEPGKLSQK 820 

CTG GCG TTA GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG AAA 2460 
IKVWLQEYWNITDLVAISTF 840

ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG GCC ATT TCC ACA TTC 2520 

MIGAMATRSVMMIGKMMIDM 860 

ATG ATT GGA GCA ATG GCC ACG AGA TCT GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG 2580 

LYFVVIMLVVLMSFGVARQA 880 

CTG TAC TTT GTG GTC ATC ATG CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA GCC 2640 

ILHPEEKPSWKLARNIFYMP 900 

ATT CTG CAT CCA GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC ATC TTC TAC ATG CCC 2700 

YWMIYGEVFADQIDLYAMEI 920 

TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC CAG ATA GAC CTC TAC GCC ATG GAA ATT 2760 

NPPCGENLYDEEGKRLPPCI 940 

AAT CCT CCT TGT GGT GAG AAC CTA TAT GAT GAG GAG GGC AAG CGG CTT CCT CCC TGT ATC 2820 

PGAWLTPALMACYLLVANIL 960 

CCC GGC GCC TGG CTC ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG 2880 

LVNLLIAVFNNTFFEVKSIS 980

CTG GTG AAC CTG CTG ATT GCT GTG TTC AAC AAT ACC TTC TTT GAA GTA AAA TCA ATA TCC 2940

NQVWKFQRYQLIMTFHDRPV 1000

AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG ATT ATG ACA TTT CAT GAC AGG CCA GTC 3000

LPPPMIILSHIYIIIMRLSG 1020

CTG CCC CCA CCG ATG ATC ATT TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC 3060

RCRKKREGDQEERDRGLKLF 1040

CGC TGC AGG AAA AAG AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC TTC 3120 

LSDEELKRLHEFEEQCVQEH 1060 

CTT AGC GAC GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG TGC GTG CAG GAG CAC 3180 

FREKEDEQQSSSDERIRVTS 1080 

TTC CGG GAG AAG GAG GAT GAG CAG CAG TCG TCC AGC GAC GAG CGC ATC CGG GTC ACT TCT 3240 

ERVENMSMRLEEINERETFM 1100 

GAA AGA GTT GAA AAT ATG TCA ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG 3300 

KTSLQTVDLRLAQLEELSNR 1120 

AAA ACT TCC CTG CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC AGA 3360 

MVNALENLAGIDRSDLIQAR 1140

ATG GTG AAT GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA CGG 3420 

SRASSECEATYLLRQSSINS 1160 

TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT CTT CTC CGG CAA AGC AGC ATC AAT AGC 3480 

ADGYSLYRYHFNGEELLFED 1180 

GCT GAT GGC TAC AGC TTG TAT CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT 3540 

TSLSTSPGTGVRKKTCSFRI 1200 

ACA TCT CTC TCC ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA 3600 

KEEKDVKTHLVPECQNSLHL 1220 

AAG GAA GAG AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC AGT CTT CAC CTT 3660 

SLGTSTSATPDGSHLAVDDL 1240 

TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT GCA GTA GAT GAC TTA 3720

KNAEESKLGPDIGISKEDDE 1260

AAG AAC GCT GAA GAG TCA AAA TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA 3780

RQTDSKKEETISPSLNKTDV 1280 

AGA CAG ACA GAC TCT AAA AAA GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT GTG 3840 

IHGQDKSDVQNTQLTVETTN 1300 

ATA CAT GGA CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA AAT 3900 

IEGTISYPLEETKITRYFPD 1320 

ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA CGC TAT TTC CCC GAT 3960 

ETINACKTMKSRSFVYSRGR 1340

GAA ACG ATC AAT GCT TGT AAA ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA 4020

KLVGGVNQDVEYSSITDQQL 1360

AAG CTG GTC GGT GGG GTT AAC CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG 4080 

TTEWQCQVQKITRSHSTDIP 1380 

ACG ACG GAA TGG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT CCT 4140 

YIVSEAAVQAEQKEQFADMQ 1400 

TAC ATT GTG TCG GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG CAG TTT GCA GAT ATG CAA 4200 

DEHHVAEAIPRIPRLSLTIT 1420

GAT GAA CAC CAT GTC GCT GAA GCA ATT CCT CGA ATC CCT CGC TTG TCC CTA ACC ATT ACT 4260 

DRNGMENLLSVKPDQTLGFP 1440 

GAC AGA AAT GGG ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA 4320 

SLRSKSLHGHPRNVKSIQGK 1460 

TCT CTC AGG TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA TCC ATT CAG GGA AAG 4380 

LDRSGHASSVSSLVIVSGMT 1480

TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC AGC TTA GTA ATT GTG TCT GGA ATG ACA 4440

AEEKKVKKEKASTETEC (SEQ ID NO: 7) 1497

GCA GAA GAA AAA AAG GTT AAG AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC TAG TCTGTT 4491

TTGTTICTIT A A TITITITITIT AACAGT^

TTCCTAAAAACATTTTCCCT (SEQ ID NO: 6)
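
Each codon row in the figure is paired with a one-letter amino acid row, so the pairing can be checked by translating the codons with the standard genetic code. The sketch below is an illustration rather than part of the figure; `CODON_TABLE` and `translate` are hypothetical names, and the example is the first coding row of FIGURE 5.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(codon_row: str) -> str:
    """Translate a whitespace-separated row of DNA codons to one-letter residues.

    Raises KeyError for any token that is not a valid codon, which flags
    rows damaged in transcription.
    """
    return "".join(CODON_TABLE[codon] for codon in codon_row.split())


# First coding row of FIGURE 5 (nucleotides 1 to 60).
row = ("ATG TAT ATC CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC "
       "CAT CTC ATG GTG AAA")
print(translate(row))
```

A row whose translation differs from the printed one-letter row points to a transcription error in one of the two lines.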
ACTCATTATAOX3NTCGAG^

AATCTTGGATAGAGAAAACCTTTTGCAAACGGGAA^ 

CTGTGGCCAGTTCACCAACCAGCATATCXICCCCT^ 

GTGGAGACTCAGCCTGAGAAATGGTCTGT^^ 

CCAGGGTGGCGGATATTCCAATAAAGCC 

MYIRVSYDTKPDSLLHLMVK 20 

ATG TAT ATC CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG AAA 60 

DWQLELPKLLISVHGGLQNF 40

GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA GGC CTC CAG AAC TTT 120 

EMQPKLKQVFGKGLIKAAMT 60 

GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG AAA GGC CTG ATC AAG GCT GCT ATG ACC 180 

TGAWIFTGGVSTGVISHVGD 80 

ACC GGG GCC TGG ATC TTC ACC GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT 240

ALKDHSSKSRGRVCAIGIAP 100 

GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT CCA 300 

WGIVENKEDLVGKDVTRVYQ 120 

TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA ACA AGA GTG TAC CAG 360 

TMSNPLSKLSVLNNSHTHFI 140 

ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG CTC AAC AAC TCC CAC ACC CAC TTC ATC 420 

LADNGTLGKYGAEVKLRRLL 160 

CTG GCT GAC AAT GGC ACC CTG GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG 480 

EKHISLQKINTRLGQGVPLV 180

GAA AAG CAC ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC GTG 540

GLVVEGGPNVVSIVLEYLQE 200

GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG GAA TAC CTG CAA GAA 600 

EPPIPVVICDGSGRASDILS 220 

GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC AGC GGA CGT GCC TCG GAC ATC CTG TCC 660 

FAHKYCEEGGIINESLREQL 240

TTT GCG CAC AAG TAC TGT GAA GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT 720 

LVTIQKTFNYNKAQSHQLFA 260 

CTA GTT ACC ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT GCA 780 

IIMECMKKKELVTVFRMGSE 280

ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC AGA ATG GGT TCT GAG 840

GQQDIEMAILTALLKGTNVS 300

GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT GCC CTG CTG AAA GGA ACA AAC GTA TCT 900

APDQLSLALAWNRVDIARSQ 320

GCT CCA GAT CAG CTG AGC TTG GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG 960 

IFVFGPHWTPLGSLAPPTDS 340 

ATC TTT GTC TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC AGC 1020 

KATEKEKKPPMATTKGGRGK 360

AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG GGA GGA AGA GGA AAA 1080

GKGKKKGKVKEEVEEETDPR 380

GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG GAA GTG GAG GAA GAA ACT GAC CCC CGG 1140 

KIELLNWVNALEQAMLDALV 400 
AAG ATA GAG CTG CTG AAC TGG GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC 1200

LDRVDFVKLLIENGVNMQHF 420 

TTA GAT CGT GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC TTT 1260

LTIPRLEELYNTRLGPPNTL 440 

CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT CCA CCA AAC ACA CTT 1320 

HLLVRDVKKSNLPPDYHISL 460 

CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC CTT CCG CCT GAT TAC CAC ATC AGC CTC 1380

IDIGLVLEYLMGGAYRCNYT 480 

ATA GAC ATC GGG CTC GTG CTG GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT 1440 

RKNFRTLYNNLFGPKRPKAL 500 

CGG AAA AAC TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT CTT 1500 

KLLGMEDDEPPAKGKKKKKK 520

AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG AAA AAA AAA AAA AAG 1560 

KKEEEIDIDVDDPAVSRFQY 540 

AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC GAC CCT GCC GTG AGT CGG TTC CAG TAT 1620 

PFHELMVWAVLMKRQKMAVF 560 

CCC TTC CAC GAG CTG ATG GTG TGG GCA GTG CTG ATG AAA CGC CAG AAA ATG GCA GTG TTC 1680 

LWQRGEESMAKALVACKLYK 580 

CTC TGG CAG CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC AAG 1740 

AMAHESSESDLVDDISQDLD 600

GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC TCC CAG GAC TTG GAT 1800 

NNSKDFGQLALELLDQSYKH 620 

AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG GAG TTA TTA GAC CAG TCC TAT AAG CAT 1860 

DEQIAMKLLTYELKNWSNST 640 

GAC GAG CAG ATC GCT ATG AAA CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC 1920 

CLKLAVAAKHRDFIAHTCSQ 660 

TGC CTC AAA CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC CAG 1980 

MLLTDMWMGRLRMRKNPGLK 680 

ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG AAC CCC GGC CTG AAG 2040 

VIMGILLPPTILFLEFRTYD 700

GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC TTG TTT TTG GAA TTT CGC ACA TAT GAT 2100

DFSYQTSKENEDGKEKEEEN 720

GAT TTC TCG TAT CAA ACA TCC AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT 2160 

TDANADAGSRKGDEENEHKK 740 

ACG GAT GCA AAT GCA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA AAA 2220 

QRIIPIGTKICKFYNAPIVK 760

CAA AGA ATT ATC CCC ATC GGA ACA AAA ATC TGT AAA TTC TAT AAC GCG CCC ATT GTC AAG 2280 

FWFYTISYLGYLLLFNYVIL 780 

TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC CTG CTG CTG TTT AAC TAC GTC ATC CTG 2340

VRMDGWPSLQEWIVISYIVS 800 

GTG CGG ATG GAT GGC TGG CCG TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC 2400 

LALEKIREILMSEPGKLSQK 820 

CTG GCG TTA GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG AAA 2460 
IKVWLQEYWNITDLVAISTF 840

ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG GCC ATT TCC ACA TTC 2520

MIGAILRLQNQPYMGYGRVI 860 

ATG ATT GGA GCA ATT CTT CGC CTA CAG AAC CAG CCC TAC ATG GGC TAT GGC CGG GTG ATC 2580 

YCVDIIFWYIRVLDIFGVNK 880

TAC TGT GTG GAT ATC ATC TTC TGG TAC ATC CGT GTC CTG GAC ATC TTT GGT GTC AAC AAG 2640

YLGPYVMMIGKMMIDMLYFV 900

TAT CTG GGG CCA TAC GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT GTG 2700

VIMLVVLMSFGVARQAILHP 920

GTC ATC ATG CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA GCC ATT CTG CAT CCA 2760

EEKPSWKLARNIFYMPYWMI 940

GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC ATC TTC TAC ATG CCC TAC TGG ATG ATC 2820

YGEVFADQIDLYAMEINPPC 960

TAT GGA GAG GTG TTT GCA GAC CAG ATA GAC CTC TAC GCC ATG GAA ATT AAT CCT CCT TGT 2880

GENLYDEEGKRLPPCIPGAW 980

GGT GAG AAC CTA TAT GAT GAG GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC TGG 2940 

LTPALMACYLLVANILLVNL 1000 

CTC ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG CTG GTG AAC CTG 3000 

LIAVFNNTFFEVKSISNQVW 1020

CTG ATT GCT GTG TTC AAC AAT ACT TTC TTT GAA GTA AAA TCA ATA TCC AAC CAG GTG TGG 3060 

KFQRYQLIMTFHDRPVLPPP 1040 

AAG TTC CAG CGA TAT CAG CTG ATT ATG ACA TTT CAT GAC AGG CCA GTC CTG CCC CCA CCG 3120 

MIILSHIYIIIMRLSGRCRK 1060

ATG ATC ATT TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG AAA 3180

KREGDQEERDRGLKLFLSDE 1080

AAG AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC TTC CTT AGC GAC GAG 3240 

ELKRLHEFEEQCVQEHFREK 1100 

GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG TGC GTG CAG GAG CAC TTC CGG GAG AAG 3300 

EDEQQSSSDERIRVTSERVE 1120 

GAG GAT GAG CAG CAG TCG TCC AGC GAC GAG CGC ATC CGG GTC ACT TCT GAA AGA GTT GAA 3360 

NMSMRLEEINERETFMKTSL 1140

AAT ATG TCA ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC CTG 3420

QTVDLRLAQLEELSNRMVNA 1160

CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC AGA ATG GTG AAT GCT 3480

LENLAGIDRSDLIQARSRAS 1180

CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA CGG TCC CGG GCT TCT 3540

SECEATYLLRQSSINSADGY 1200

TCT GAA TGT GAG GCA ACG TAT CTT CTC CGG CAA AGC AGC ATC AAT AGC GCT GAT GGC TAC 3600 

SLYRYHFNGEELLFEDTSLS 1220 

AGC TTG TAT CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC TCC 3660 

TSPGTGVRKKTCSFRIKEEK 1240 

ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA AAG GAA GAG AAG 3720 
DVKTHLVPECQNSLHLSLGT 1260
GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC AGT CTT CAC CTT TCA CTG GGC ACA 3780

STSATPDGSHLAVDDLKNAE 1280
AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT GCA GTA GAT GAC TTA AAG AAC GCT GAA 3840

ESKLGPDIGISKEDDERQTD 1300
GAG TCA AAA TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA GAC 3900

SKKEETISPSLNKTDVIHGQ 1320
TCT AAA AAA GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT GTG ATA CAT GGA CAG 3960

DKSDVQNTQLTVETTNIEGT 1340
GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA AAT ATA GAA GGC ACT 4020

ISYPLEETKITRYFPDETIN 1360
ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA CGC TAT TTC CCC GAT GAA ACG ATC AAT 4080

ACKTMKSRSFVYSRGRKLVG 1380
GCT TGT AAA ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC GGT 4140

GVNQDVEYSSITDQQLTTEW 1400
GGG GTT AAC CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG ACG ACG GAA TGG 4200

QCQVQKITRSHSTDIPYIVS 1420
CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT CCT TAC ATT GTG TCG 4260

EAAVQAEQKEQFADMQDEHH 1440
GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG CAG TTT GCA GAT ATG CAA GAT GAA CAC CAT 4320

VAEAIPRIPRLSLTITDRNG 1460
GTC GCT GAA GCA ATT CCT CGA ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT GGG 4380

MENLLSVKPDQTLGFPSLRS 1480
ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA TCT CTC AGG TCA 4440

KSLHGHPRNVKSIQGKLDRS 1500
AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA TCC ATT CAG GGA AAG TTA GAC AGA TCT 4500

GHASSVSSLVIVSGMTAEEK 1520
GGA CAT GCC AGT AGT GTA AGC AGC TTA GTA ATT GTG TCT GGA ATG ACA GCA GAA GAA AAA 4560

KVKKEKASTETEC* (SEQ ID NO: 9) 1533
AAG GTT AAG AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC TAG TCTGTTTTGTTTCTTTAATTTTT 4599

TTTTTTAACAGTCAGAAACrCACT^

tccct (SEQ ID NO: 8)
PATENT

ATTORNEY DOCKET NO: 07334/004001
COMBINED DECLARATION AND POWER OF ATTORNEY 
As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name.

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, 
first and joint inventor (if plural names are listed below) of the subject matter which is claimed and for 
which a patent is sought on the invention entitled COMPOSITIONS AND METHODS FOR THE 
DIAGNOSIS, PREVENTION AND TREATMENT OF TUMOR PROGRESSION, the specification of 
which 

■ was filed on March 29, 1996 as Application Serial No. 08/623,679 and was amended on . 

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose all information I know to be material to patentability in 
accordance with Title 37, Code of Federal Regulations, §1.56(a). 

I hereby claim the benefit under Title 35, United States Code, §120 of any United States
application(s) listed below and, insofar as the subject matter of each of the claims of this application is not
disclosed in the prior United States application in the manner provided by the first paragraph of Title 35,
United States Code, §112, I acknowledge the duty to disclose all information I know to be material to
patentability as defined in Title 37, Code of Federal Regulations, §1.56(a) which became available between
the filing date of the prior application and the national or PCT international filing date of this application:

U.S. SERIAL NO. FILING DATE STATUS 

08/412,431 March 29, 1995 ■ Pending □ Issued □ Abandoned

I hereby appoint the following attorneys and/or agents to prosecute this application and to
transact all business in the Patent and Trademark Office connected therewith: J. Peter Fasse,
Reg. No. 32,983; Mark F. Boshar, Reg. No. 35,456; William E. Booth, Reg. No. 28,933; Barry E.
Bretschneider, Reg. No. 28,055; John W. Freeman, Reg. No. 29,066; Timothy A. French, Reg. No. 30,175;
Alan H. Gordon, Reg. No. 26,168; John F. Land, Reg. No. 29,554; John B. Pegram, Reg. No. 25,198; Rene
D. Tegtmeyer, Reg. No. 33,567; Hans R. Troesch, Reg. No. 36,950; Dorothy P. Whelan, Reg. No. 33,814;
Charles C. Winchester, Reg. No. 21,040.

Address all telephone calls to J. Peter Fasse at telephone number 617/542-5070. 

Address all correspondence to J. Peter Fasse, Fish & Richardson P.C., 225 Franklin Street,
Boston, MA 02110-2804. 

I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these statements were 
made with the knowledge that willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful 
false statements may jeopardize the validity of the application or any patents issued thereon. 